tag:blogger.com,1999:blog-30938901737123371632014-10-07T06:49:32.671+02:00Plausible ReasoningApplied to problems big and smalljplnoreply@blogger.comBlogger10125tag:blogger.com,1999:blog-3093890173712337163.post-46444986904595140292010-01-11T18:31:00.004+01:002010-01-11T19:12:18.443+01:00The real definition of probability<p>Over the centuries, many great minds have pondered the meaning of probability, trying to apply it to profound questions such as "is the harvest likely to be good this year", "how likely is my stock portfolio to appreciate", "will I get a hangover after drinking this bottle", etc. Some even went as far as to claim that the term "probability" is undefinable using simpler concepts and nevertheless fearlessly proceeded to derive all the correct rules for relating one probability to another without ever revealing the secret of determining the value of either one.</p><p>Thanks to the power of the Interwebs and Inkscape you no longer have to wonder alone. Probability is just glorified <strong>counting</strong> and <strong>taking ratios</strong> (e.g. counting things of one type within the set of things of another type). In the end, even the supposedly more general Bayesian view on probability reduces to just one elementary operation, counting. This means that with enough perseverance one can reduce any probabilistic problem to counting balls in an imaginary urn. Like so (click to enlarge):</p>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_o_c25p_Zq5Y/S0tggbggNEI/AAAAAAAAABw/3xApyJhccI4/s1600-h/probability-counting.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 223px; height: 400px;" src="http://3.bp.blogspot.com/_o_c25p_Zq5Y/S0tggbggNEI/AAAAAAAAABw/3xApyJhccI4/s400/probability-counting.png" border="0" alt="Definition of probability in terms of counting" id="BLOGGER_PHOTO_ID_5425536286354060354" /></a>
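<p>To make the counting definition concrete, here is a minimal sketch in Python (the ball counts are invented for illustration): probability is a ratio of counts, and conditional probability is the same ratio taken within a subset of the urn.</p>

```python
from fractions import Fraction

# An imaginary urn: each ball is labeled with a color and a size.
# (The counts below are made up purely for illustration.)
urn = ["red-small"] * 3 + ["red-large"] * 1 + \
      ["blue-small"] * 2 + ["blue-large"] * 4

# P(red) = (number of red balls) / (total number of balls)
p_red = Fraction(sum(b.startswith("red") for b in urn), len(urn))

# P(small | red) = (number of small balls among the red ones) / (number of red balls)
reds = [b for b in urn if b.startswith("red")]
p_small_given_red = Fraction(sum(b.endswith("small") for b in reds), len(reds))

print(p_red)              # 4/10, i.e. 2/5
print(p_small_given_red)  # 3/4
```

<p>Every probability statement about this urn is a statement about counts; nothing else is needed.</p>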
<p>By the way, when you hear talk about "prior information", what is really meant is "counts". The natural question is to ask <em>which</em> counts exactly. If they can't tell you, they are safe to ignore. Also, keep in mind that some counts don't count as much as others. Probably...</p><p>(Coming soon: the real definition of maximum entropy - it's all about counting too!)</p><p>More seriously, it may be helpful to realize that every probabilistic model <em>corresponds exactly</em> to some such urn-based setup. Any reasoning using the urn model can be mapped back to the situation described by the probabilistic model - and vice versa. Moreover, urn-based setups may be formally transformed into one another while preserving their meaning. While it's difficult to juggle probabilistic formulae in one's mind, ball-filled urns are quite easy to visualize and quick-check for surprising contents. The continuous probability case also fits nicely by imagining a "going to the limit" process, that is, shrinking the balls ad infinitum.</p><p>Shrinking balls. Oh well...</p>jplnoreply@blogger.com0tag:blogger.com,1999:blog-3093890173712337163.post-91384815276257102422009-10-31T17:26:00.003+01:002009-10-31T17:34:46.641+01:00The curious word 'why'The same little word "why" may be used to ask for a causal explanation ("by what process has X come to be?") or for a goal ("what purpose does X serve, as a part of a larger mechanism?"). These are obviously quite opposite ways of looking at things - backwards vs. forwards in time - one of them neutral, the other postulating agency. So I wonder: why does the word "why" combine both meanings, and what are the consequences for everyday reasoning?
Is it so in all languages, or do some exist in which inquiries as to cause and purpose <em>must</em> be represented by two distinct question words?jplnoreply@blogger.com0tag:blogger.com,1999:blog-3093890173712337163.post-26077166017334976392009-08-31T22:36:00.002+02:002009-08-31T23:03:34.293+02:00Accelerating Genetic Engineering<p>Throughout the history of science, those disciplines capable of controlled experimentation have advanced rapidly (e.g. physics) while those with limited capability of this sort have hardly made any progress in comparison (e.g. social sciences or economics). It would be no exaggeration to say that the tweak-it-and-see-what-happens-then approach is <em>the</em> key to gaining insight into how systems operate and how they can be changed to our advantage. Beyond science, it also appears to be the base principle for learning of any kind (consider language development in children, for example). For efficiency, it is crucial that the tweaking occurs in a controlled fashion, <em>ceteris paribus</em>, completely at our will, not disturbed by what statisticians call "confounding factors".</p>
<p>Consider debugging computer software as another example (or troubleshooting of any kind, if you are not into software). If the computer program under inspection only changes its behavior in response to the programmer's modifications and inputs under her control, then the task of understanding and shaping it into whatever form is desired is mostly trivial. However, if there are unknown varying inputs that influence the program's behavior on each run and mask the programmer's corrective actions, the debugging task becomes a nightmare, or at least calls for statistical analysis (a skill not commonly available to real-life programmers). The same sort of problem arises if the modifications available to the programmer are too coarse-grained, e.g. if she can only replace large (and needed) components rather than "dig inside" and fix them.</p>
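<p>The contrast can be sketched in a few lines of Python (all names and numbers are hypothetical): with the hidden input held constant, the effect of a fix shows on every single run; with an uncontrolled input, individual before/after comparisons often point the wrong way and only aggregation over many runs helps.</p>

```python
import random

# A toy "program under inspection" (hypothetical): its output is too large,
# and the programmer's "fix" halves it.
def program(controlled_input, hidden_input, fixed=False):
    out = controlled_input + hidden_input
    return out / 2 if fixed else out

# Controlled experiment: the hidden input is held constant (ceteris paribus),
# so the improvement from the fix is visible on every run.
assert program(10, hidden_input=0, fixed=True) < program(10, hidden_input=0, fixed=False)

# Confounded experiment: the hidden input varies between runs, masking the fix.
rng = random.Random(0)
runs = [(program(10, rng.uniform(0, 50), fixed=True),
         program(10, rng.uniform(0, 50), fixed=False)) for _ in range(1000)]
misleading = sum(fixed_out >= unfixed_out for fixed_out, unfixed_out in runs)

# A noticeable fraction of single comparisons suggests the fix did nothing
# or made things worse; only statistics over many runs recovers the truth.
print(f"{misleading} of 1000 single before/after comparisons point the wrong way")
```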
<p>It appears that researchers in Genetic Engineering have very recently made a breakthrough by gaining the ability to not just <em>observe</em>, but also tweak their "programs" in a piecewise, controlled fashion. Watch this presentation by Craig Venter to learn more: <a href="http://www.edge.org/3rd_culture/church_venter09/church_venter09_index.html#video">From Darwin to New Fuels (In A Very Short Time)</a>. They now expect that progress will be greatly accelerated by this capability, and looking at history, there is every reason to believe them. The potential for grim accidents is also there, of the same sort as is present in software systems. The same tweak-and-see techniques that are so helpful in offline development environments can wreak havoc when (or rather if) applied in production systems. (Most) programmers are smart enough to make the distinction. The same must be expected from genetic engineers.</p>jplnoreply@blogger.com0tag:blogger.com,1999:blog-3093890173712337163.post-66124592415133302522009-01-05T20:12:00.002+01:002009-01-05T20:19:58.368+01:00No fuss about causality<p>Throughout history and up to the present day, a big fuss has been made
among philosophers about defining and dealing with causality. For a nice
overview, see <a href="http://bayes.cs.ucla.edu/LECTURE/lecture_sec1.htm">these lecture slides</a>, which illustrate the
troubled history of the concept. In recent times, formal approaches have
been developed to connect causality to probabilistic/statistical
reasoning (<a href="http://www.stat.harvard.edu/faculty_page.php?page=rubin.html">Rubin</a>) or to do just the opposite, treating
causality as an extension lying supposedly entirely outside the scope of
probability theory (<a href="http://bayes.cs.ucla.edu/BOOK-2K/">Pearl</a>). It seems that the causality
debate still rages on, apparently now on the battlefield of notations.
For example, listen to Pearl's <a href="http://bayes.cs.ucla.edu/video/Hopkins_2008_Causal_Inference.html">recent lecture</a> in which he
quips that "mere mortals" not trained by Rubin cannot verify certain
expressions required within Rubin's framework. Pearl himself advocates a
graphical representation of causality (little wonder in light of his
past work). Even so, when asked about modeling just slightly
complicated scenarios (A causes B, but only given C), he grudgingly
admits that graphs cannot directly express such constraints. Instead,
the constraint can be hidden within the probability distribution
associated with a graph.</p><p>Hearing all this, I wonder whether the award-winning philosopher
is not now in the business of shooting sparrows with cannons. I agree
with Pearl's assessment that given a set of structural equations
or a graphical model (like his electric circuit example), all causal
and counterfactual questions can be readily answered by simply running
the model (simulation). I'm puzzled why Pearl does not go one step
further and point out that nowadays (and for 50+ years now) we have
very elaborate and wildly popular tools for expressing causal models
and the equipment for running them. They are imperative programming
languages and computers, of course. Every program written in
an imperative language <em>is</em> an intricate causal model, in which
expressing constraints of the sort mentioned above comes effortlessly
and the notion of time (so central to all causal reasoning) is given
by the execution semantics.</p><p>For example:</p><pre>
if (c == C)
{
    if (a == A)
    {
        b = B;
    }
}
</pre><p>which is of course equivalent to</p><pre>
if (c == C && a == A) { b = B; }
</pre><p>which is of course equivalent to stating "A and C (combined) cause B".
Given such a model, we may call A and C separately "necessary causes" if we
so prefer. We may call "A and C" the "sufficient cause". Finally, given
a particular run and a different expression of the sort "A or C", we may
speak of the "actual" cause having been either "A" or "C" or both.
What I wish to say is that there are no doubts about causality given
a model in the form of a computer program. It also makes it obvious
how pointing to a single variable as "the" cause of something could
be incorrect. Finally, modeling runs of computer programs has been
a topic in computer science for decades, even if the researchers have
never bothered to use the word "causality" in this context.</p><p>Of course, computer programs are entirely deterministic and hardly
"statistical" beasts. However, who says that the "real-world" causality
is not or at least <em>may not be treated as such</em>? If you view
probability, as I do, as a means for modeling epistemic (that is,
modeler's own) uncertainty rather than some ontological "stochastic
randomness" of nature, then you can apply it without hesitation to
deterministic computer programs, in circumstances where parts of the
state or code are unknown. For example, you could model an unknown
variable value as a probability distribution over possible values,
or you could model an unknown segment of code as a probability
distribution over possible segments. (If you can't even enumerate the
possibilities or if they appear "infinite", you are in trouble;
ask yourself whether and why you know so little and how you could
find out more.)</p><p>The challenge of science is, as Pearl rightly points out, that we
seldom <em>know</em> the causal model. That is, we either don't know
what program has (or may have) generated our observations, or the same
set of observations might equally well have been generated by many
different programs. In this latter case we might, absent other
information, assign a uniform probability distribution over those
programs. Our task then is to somehow infer the
program from the observations and from "causal assumptions" - data
and the prior. The "somehow" should be plausible reasoning according
to the rules of probability theory, and so we have a connection
(not of the sort contemplated by Pearl/Rubin).</p><p>The causal assumptions correspond to our estimate about which models
(programs) are possible at all, and which are consistent with other
models (programs) that we already deem as accurate and useful
representations of reality. Interventions before observation help
enormously by lowering probabilities for sets of programs not
compatible with the intervention+observation data.</p><p>For example, given the following set of observations:</p><pre>
a = 1, b = 0
a = 0, b = 0
a = 0, b = 1
a = 1, b = 0
a = 0, b = 1
a = 1, b = 0
a = 1, b = 0
</pre><p>we could just as well fit the following two causal models (and many
others):</p><pre>
if (a == 1) { b = 0; }
</pre><p>or</p><pre>
if (b == 1) { a = 0; }
</pre><p>However, if we perform a set of interventions of setting b = 1 and
observing a != 0, and another set of interventions of setting a = 1
and observing b == 0, the first model will stand the test while the
second one will become very implausible. However, we should be careful
to not proclaim it impossible, as there still could be hidden variables
not accounted for within the model contributing to the observed outcomes.
One day, we might find these factors and control for them and
setting b = 1 might then indeed begin causing a == 0. And so we see
that:</p><ul>
<li>causality, much like probability, is in the eye of the beholder</li>
<li>(incomplete) causal models may be treated as if they generated
data according to some probability distributions</li>
<li>causal models may be assigned probabilities</li>
</ul><p>That said, there is little reason to make a big fuss about finding
the "one true definition" of causality, the "one true notation" for
representing causal arguments, or "measurement methods" for
determining strength of "causal connections". We have no need for
big philosophy of causal reasoning, but great need for good,
sufficiently granular and computationally cheap causal models
that reliably deliver predictions about effects of actions to
their users.</p>jplnoreply@blogger.com0tag:blogger.com,1999:blog-3093890173712337163.post-24392237156385815482008-11-01T15:28:00.010+01:002008-11-01T16:05:06.046+01:00The intellectual dishonesty of "stochastics"<p>To describe inference problems using the language of stochastics does not necessarily yield poor results, but it seems inherently intellectually dishonest. To see what I mean, consider a typical language used in stochastics: "Given that we are dealing with a random process of the sort X, we can infer that Y is true ... [a valid argument follows]". The intellectual dishonesty is concealed in the "given that" introduction, as users of stochastics arguments hardly ever feel obliged to demonstrate that the premise is fulfilled. A particularly frequent example is the assumption of normally distributed errors.</p>
<p>A satisfactory demonstration of the assumptions' validity would usually require many empirical measurements, which might be outright impossible (e.g. to determine the error of an instrument you need an even more accurate instrument, which might be unavailable), too expensive, or simply out of reach of the person who is making the stochastic argument. If confronted with that inconvenient fact, several lame tactics are possible:</p>
<ul>
<li>refer to the literature (claim that the actual measurements have already been made... by someone else... sometime);</li>
<li>refer to others behaving the same way (if everybody does it, then it must be right);</li>
<li>vaguely proclaim that we are dealing with idealized models, so we're ok after all;</li>
<li>if the normal distribution is questioned, refer to its natural occurrence and the central limit theorem - that is, claim that it is very likely to be the right distribution, after all.</li>
</ul>
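<p>To be fair, the last tactic leans on a real theorem. A quick simulation (with invented numbers) shows why normality is so tempting to assume when many independent contributions add up - while demonstrating nothing, of course, about the errors of any particular instrument:</p>

```python
import random
import statistics

rng = random.Random(42)

# Sum many independent, decidedly non-normal (uniform) contributions.
# By the central limit theorem, the sums cluster symmetrically around the mean.
sums = [sum(rng.random() for _ in range(48)) for _ in range(20000)]

mean = statistics.mean(sums)    # close to 48 * 0.5 = 24
stdev = statistics.stdev(sums)  # close to sqrt(48 / 12) = 2

# For a normal distribution, roughly 68% of the mass lies within one
# standard deviation of the mean; the simulated sums come very close.
within_one_sd = sum(abs(s - mean) <= stdev for s in sums) / len(sums)
print(round(within_one_sd, 2))
```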
<p>Why are these tactics lame? Because they are simply attempts to conceal, at all costs, the speaker's <span style="font-style:italic;">lack of information</span>; socially conditioned maneuvers to retain authority on a subject. However, we can and ought to be smarter than that. Consider this:</p>
<ol>
<li>Knowing is generally preferred to not knowing.</li>
<li>Knowing that you don't know is generally preferred to pretending (to yourself and others) that you do. Even if it makes you feel good and shuts up critics.</li>
</ol>
<p>It turns out that the "stochastic" statements about the <span style="font-style:italic;">random</span> process can be easily translated into statements about the speaker's (and perhaps everyone else's!) lack of information about the exact characteristics of the <span style="font-style:italic;">deterministic</span> process. In other words, we <span style="font-style:italic;">assume</span> a particular distribution because we <span style="font-style:italic;">don't know</span> any better one - the alternatives are even worse <span style="font-style:italic;">given what we do know</span>. If you think about it for a second, it is quite a different (and better) approach than lying to yourself about what you know, for the very simple reason that the former way of thinking invites the possibility of learning more while the latter way of thinking has precisely the opposite effect.</p>
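<p>This translation can even be sketched computationally. Given several candidate beliefs about the same unknown quantity (all shapes below are invented for illustration), the smooth, bell-like one admits the most ignorance - it has the highest entropy - which is one honest reading of "we assume it because we don't know any better":</p>

```python
import math

# Three candidate beliefs about a measurement error on the same grid of
# possible values; all numbers are invented purely for illustration.
grid = range(-5, 6)

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

gaussian_like = normalize([math.exp(-x * x / 8) for x in grid])       # smooth, symmetric
spiky = normalize([1.0 if x in (-4, 0, 4) else 0.001 for x in grid])  # commits to 3 values
lopsided = normalize([math.exp(-abs(x - 2)) for x in grid])           # commits to x = 2

def entropy(p):
    # Shannon entropy: how much "not knowing" a distribution admits to.
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# The least committal candidate - the one admitting the most ignorance -
# is the smooth, gaussian-like one.
assert entropy(gaussian_like) > max(entropy(spiky), entropy(lopsided))
print("least committal: gaussian-like")
```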
<p>It is very possible that experienced users of stochastics do realize all of the above, and so I am belaboring a trivial point. If so, it remains somewhat puzzling why their language does not mirror their thinking. A case of professional jargon abuse, maybe? Needless to say, this sort of language is definitely misleading to the uninitiated student of probability theory/statistics. The sooner you see through it, the better.</p>jplnoreply@blogger.com0tag:blogger.com,1999:blog-3093890173712337163.post-82834511835569933572008-11-01T12:10:00.014+01:002008-11-01T13:58:19.632+01:00On zero probability in the continuous variable case<p>Here is a quote from <a href="http://www.cs.unc.edu/~tracker/media/pdf/SIGGRAPH2001_CoursePack_08.pdf">a SIGGRAPH course by Welch and Bishop</a>:</p>
<blockquote>In the case of continuous random variables, the probability of any <span style="font-style:italic;">single</span> discrete event A is in fact 0.</blockquote>
<p>The same quote could be taken from many introductory texts on probability theory. It seemed absurd to me the first time I read it. Back then, I got over it, attributing the feeling to my own inexperience. Well, after some years and, I dare say, improved understanding, I know that it <span style="font-style:italic;">is</span> in fact an absurd - or at least an uncomfortably sloppy - statement. Moreover, I can explain why and get rid of the confusion.</p>
<p>There are two main reasons for the intuitively perceived absurdity:</p>
<ul>
<li>Zero probability is synonymous with "impossible event". If the quoted statement were true, it would follow that, regardless of what value of the random variable you choose, that value is impossible (and I really mean <span style="font-style:italic;">any</span> value). Yet we know from experience, which our model is supposed to reflect, that the random variable does assume <span style="font-style:italic;">some</span> value in reality.</li>
<li>The positive probability of a value falling in a given interval arises from summing probabilities (integration) of all discrete values within that interval. However, adding together zeros - even in an infinite loop - yields zero.</li>
</ul>
<p>Of course, one could ask: if the probability P(A=x) is not 0 in the continuous case, then how big is this probability? The simple answer to that is: <span style="font-style:italic;">there is no continuous case</span>, it is a figment of a mathematician's imagination, a model primarily intended to ease calculations, rather than a representation of reality. The zero probability "exists" in the same sense as a mathematical point "exists". On the other hand, when we talk about "possible" and "impossible" events, we talk about [our perceptions of] reality. We'd also like the connection to reality to remain intact when we use the notion of probability, continuous or not. Of course, if the continuous case is discretized (and you can choose to do it using as many discrete events as you desire), the "paradox" of possible zero-probability events is resolved at once.</p>
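<p>The discretization argument can be sketched in a few lines of Python (the bin counts are arbitrary): each single bin's probability shrinks toward zero as the discretization is refined, but never reaches it, while interval probabilities stay put.</p>

```python
from fractions import Fraction

# Discretize the "continuous" uniform case on [0, 1) into n equal bins.
for n in (8, 1000, 10**6):
    p_single = Fraction(1, n)  # probability of any single bin
    assert p_single > 0        # it approaches zero, but never reaches it
    assert p_single * n == 1   # the bins still exhaust all possibilities
    # An interval such as [0.25, 0.5) covers a quarter of the bins, and its
    # probability stays fixed at 1/4 no matter how fine the discretization:
    p_interval = Fraction(n // 4, n)
    assert p_interval == Fraction(1, 4)
print("every finite discretization is paradox-free")
```

<p>No zero-probability-yet-possible events appear at any finite resolution; they are an artifact of the limit alone.</p>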
<p>Where do the idea and the bold assertions about P(A=x) = 0 come from, then? They are but a sloppy description of the limiting <span style="font-style:italic">process</span>, of increasing the number of events without bounds. That is, a way to say that "the more equiprobable events we have, the smaller the per-event probability". It is correct to say that we <span style="font-style:italic">approach</span> zero probability, which is quite a different thing from saying that we (ever) reach this value. In all practical thinking, we may safely ignore infinite processes and infinite "things" a mathematician is so fond of, or better yet, accept them as a <span style="font-style:italic">convenient approximation of our discrete reality</span> to which our actual reasoning applies.</p>jplnoreply@blogger.com3tag:blogger.com,1999:blog-3093890173712337163.post-83048612317230387652008-08-24T16:45:00.010+02:002008-11-01T12:38:58.824+01:00Introduction to Probability Theory, Part 4<div style="text-align: left"><a href="/2008/08/introduction-to-probability-theory-part_6153.html">Continued from Part 3...</a></div>
<p>Having discussed how different persons might assign different probabilities to the same proposition and for the time being disposed of the notion of the "one true probability", let's turn to another intriguing question: how does one person know whether her assigned probability is correct - that it reflects her own background information? Obviously, if what you know remains unchanged, so should the probability assignments that you make for any proposition on that basis. In other words, it would not be sensible to assign at your whim two different probabilities to the same proposition, unless you have in the meantime learned something new which is somehow related to that proposition. However, if you agree with me that probability assignments are stable given some state of knowledge, the question still remains: which probability assignment among the infinite number of assignments between 0 and 1 is appropriate - and what does "appropriate" even mean, precisely?</p>
<p>The answer to the second question given by probability theory is intuitive and satisfactory. An "appropriate" or "correct" probability assignment is one that is consistent with all the other probability assignments you might make. That is, you cannot have your cake and eat it too: because all considered propositions are either true or false, and because they are interwoven (their meanings are related to each other), there is some risk that you might come up with an internally contradictory probability assignment - based on which you'd have to conclude that some proposition is both true <i>and</i> false. The "correct" assignment, on the other hand, does not lead to any such absurd conclusions.</p>
<p>For example, you cannot and would not at the same time believe that I am both younger and older than any given age; if you felt 75% sure that I'm older, you'd also feel 25% sure that I'm younger and vice versa. However, if you were to assign probabilities to some related and indirect propositions instead, from which my age could be derived (say, propositions about myself having witnessed certain historical events during my lifetime, propositions about my friends' and parents' ages etc.), it could happen by accident that your combined probability assignment would imply that you <i>do</i> believe in that absurdity. You would then have to reject such a probability assignment and find out at which particular subproposition it went wrong (including the possibility of going wrong many times).</p>
<p>As an analogy, it helps to consider "financial arithmetic". If you were an accountant tasked with summing percentage fractions representing parts of a whole amount and arrived at a sum which was either greater or less than 100%, you'd know that you must have made a mistake somewhere along the way. Note, however, that this is a rather weak criterion of correctness: not all sums that end up at 100% contain all the right components. Indeed, you can produce an infinite number of artificial sums that all end up at 100% by tweaking the individual components relative to each other. So what you'd need to become more certain that the arithmetical calculation reflects reality would be some additional means of checking consistency, such as partial sums. Depending on your level of paranoia, you could introduce more and more partial sums ad infinitum. The point is, they would all have to be consistent in order for you to be satisfied that the calculation - analogous to a probability assignment - was true. Just one slight deviation from their expected relationship would mean that an error slipped in.</p>
<p>Now that we have introduced self-consistency as a means of checking whether a given probability assignment is <i>the</i> correct one, have we also answered the first question - how do we find this correct assignment? Yes and no. In principle, we could "simply" write down hundreds of thousands of probability assignments and then go through each one and note the inconsistencies it contains, and in the end accept the probability assignment which has the least amount of inconsistency. Obviously, given the infinite number of different possible assignments, this would be a formidable task (for any human and for any machine), and also nothing like what we are used to in solving real problems. This would be comparable to an accountant randomly generating hundreds of thousands of balance statements and then going through the heap to check which of them reflects the company's finances. Fortunately, that's not how accountants work, and it's not a sensible use of probability theory either. What we need instead is a set of reliable, mechanical rules that allows us to <i>construct</i> internally consistent assignments, as long as we stick to them, much like the rules of arithmetic never let you down. Such rules do indeed exist, and they form the very core of probability theory, or as <a href="http://en.wikipedia.org/wiki/Richard_Threlkeld_Cox">R. T. Cox</a> called them, "an algebra of probable inference".</p>
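<p>As a preview of such rules, here are the two central ones - the sum rule and the product rule - checked by brute-force counting over an invented population (all numbers made up). Assignments built by applying these rules to the same counts cannot contradict each other:</p>

```python
from fractions import Fraction

# An invented population of (weather, traffic) days, encoded by counts.
days = [("rain", "jam")] * 3 + [("rain", "clear")] * 1 + \
       [("sun", "jam")] * 2 + [("sun", "clear")] * 6

def p(predicate):
    # Probability as a ratio of counts over the whole population.
    return Fraction(sum(predicate(d) for d in days), len(days))

def p_given(predicate, condition):
    # Conditional probability: the same ratio, taken within a subset.
    subset = [d for d in days if condition(d)]
    return Fraction(sum(predicate(d) for d in subset), len(subset))

rain = lambda d: d[0] == "rain"
jam = lambda d: d[1] == "jam"

# Sum rule: P(rain) + P(not rain) = 1
assert p(rain) + p(lambda d: not rain(d)) == 1
# Product rule: P(rain and jam) = P(rain) * P(jam | rain)
assert p(lambda d: rain(d) and jam(d)) == p(rain) * p_given(jam, rain)
print(p(lambda d: rain(d) and jam(d)))  # 3/12, i.e. 1/4
```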
<div style="text-align: right">To be continued...</div>jplnoreply@blogger.com3tag:blogger.com,1999:blog-3093890173712337163.post-60906692814485206262008-08-17T13:28:00.011+02:002008-11-01T12:39:23.375+01:00Introduction to Probability Theory, Part 3<div style="text-align: left"><a href="/2008/08/introduction-to-probability-theory-part_17.html">Continued from Part 2...</a></div>
<h4>All probabilities are conditional</h4>
<p>Ok, now we come to an interesting point. If probability is attached to propositions, and propositions are about objective things that can be true or false, is it right to say that "the objective probability of proposition X is so and so"? What if you and I disagree in our probability assignments for the same proposition - I feel that this page is top-notch and you feel that it is mediocre, without either of us knowing the actual public rating?</p>
<p>Regardless of what you might have been taught about "events" having some inherent "probabilities" that "we" are trying to calculate, the above example of disagreement about the probability of a proposition is a perfectly normal situation. We all know that it happens all the time. Just turn on your TV and look at some programme with folks arguing like crazy about different issues. Obviously, independently of how concrete a proposition is, people may disagree on its probability - it is a measure of each person's own degree of confidence, after all! Now, the next question naturally is: why can different people have different degrees of confidence in the truth of the same thing?</p>
<p>The answer lies in their different states of information. Whether or not you consider some proposition likely strongly depends on what other propositions you believe in. In a way, all the different propositions are related in our heads, and we are usually quite ready to change our opinion on one proposition after learning something about another. For example, you might be somewhat certain that I'm a native English speaker after reading my text, but if you could glance briefly at my passport, it would change your assessment. If you saw an entry "American" under nationality in it, you'd become (almost) certain about the truth of that proposition. On the other hand, if you saw some other nationality, you would become almost certain that the proposition is false. Now, someone else might not have had the same opportunity to look at my passport and would therefore assign a different probability.</p>
<p>Generally, what probability we assign to some proposition depends on what we already know about some other propositions. In mathematical speak, we refer to <i>conditional</i> probability - the probability assigned to X <i>given that we already know</i> that Y is true. In fact, for all practical purposes, all probabilities are conditional. Instead of saying that two different people assign <i>a different probability</i> to the same proposition, we may just as well say that they are just giving us <i>two different probabilities</i> concerning this proposition. The first person is giving us the probability conditional on A (her state of information), the other person is giving the probability conditional on B (her different state of information). There is nothing strange or disturbing about the discrepancy in numbers that arises then. On the contrary, if we could bring the two persons to believe exactly the same set of the "remaining" relevant propositions, they would agree perfectly on the probability assigned to the one uncertain proposition because they would effectively think exactly the same and thus lack any reason to disagree. This convergence of opinions is not easy to achieve, but it is not as far-fetched as it might seem. It can and routinely does happen during practical investigations and in science.</p>
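<p>The two states of information can be mimicked by counting within two different subsets of an invented population - a Python sketch in which "native speaker" stands in, purely for illustration, for nationality, and all counts are made up:</p>

```python
from fractions import Fraction

# An invented roster of people: (nationality, writes_english_posts).
people = [("American", True)] * 40 + [("American", False)] * 10 + \
         [("other", True)] * 30 + [("other", False)] * 120

def p_american(known):
    # Probability of nationality == "American", given what the observer knows:
    # a ratio of counts within the subset compatible with that knowledge.
    subset = [person for person in people if known(person)]
    return Fraction(sum(nat == "American" for nat, _ in subset), len(subset))

# Observer A has only read the posts (knows writes_english_posts is True).
p_given_posts = p_american(lambda person: person[1])
# Observer B has also glanced at the passport (knows the nationality outright).
p_given_passport = p_american(lambda person: person[0] == "American")

print(p_given_posts)     # 40/70, i.e. 4/7
print(p_given_passport)  # 1 - certainty
```

<p>Both numbers are "correct"; they are simply conditional on different states of information.</p>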
<p>The important point to take from this part is: probabilities are assigned to propositions, but they are not properties of the propositions alone. Instead, a probability is a property of the proposition in question together with all other propositions held to be true by the person who assigned the probability. In fact, we can forget about the person altogether and just represent her by the totality of all propositions she knows to be true.</p>
<div style="text-align: right"><a href="/2008/08/introduction-to-probability-theory-part_24.html">Continued in Part 4...</a></div>jplnoreply@blogger.com0tag:blogger.com,1999:blog-3093890173712337163.post-73725957905774443092008-08-17T13:27:00.007+02:002008-11-01T12:39:41.870+01:00Introduction to Probability Theory, Part 2<div style="text-align: left"><a href="/2008/08/introduction-to-probability-theory-part.html">Continued from Part 1...</a></div>
<h4>Propositions - the carriers of probability</h4>
<p>A probability is a number between 0 and 1 which expresses someone's degree of confidence in the truth of some proposition. A proposition is simply a statement of fact like "This page is over 1000 words long". In reality, every such statement can be either true or false. You would only talk about a "probability" if you were unsure which of the two (true or false) was the case. However, what you always <i>do</i> know up front is that the proposition is either false or true, but not both, and not something in between either.</p>
<p>What about statements like "This page is entertaining and informative"? How can it <i>really</i> be either true or false? Doesn't it just depend on who is judging it? Well, it does, until you define some way of measuring "entertaining and informative" which does not involve a single person's tastes. But let's say that we agreed on some voting scheme in which all potential evaluators would participate. Then "entertaining and informative" would no longer be up to your or my opinion only - it would become more of an objective property of this page. And yes, without having seen the actual ratings, you could be unsure about this property (how everyone has rated it). So you could assign different probabilities to all the possible "entertaining and informative" ratings it might have. In other words, you would then have propositions like "The entertainment rating is 0/10" or "The informativeness rating is 9/10", and of course each of them could be true or false, but not both at the same time. You might feel more confident that this page has good ratings than bad ratings and express this numerically through your probability assignment when asked about it.</p>
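<p>Such an assignment can be sketched directly in Python (the numbers are invented): one probability per mutually exclusive rating proposition, exhausting certainty between them.</p>

```python
# One person's probability assignment over the mutually exclusive
# propositions "the entertainment rating is r/10" (numbers invented).
assignment = {0: 0.01, 1: 0.01, 2: 0.02, 3: 0.05, 4: 0.06,
              5: 0.10, 6: 0.15, 7: 0.25, 8: 0.20, 9: 0.10, 10: 0.05}

# Exactly one of these propositions is true, so the degrees of confidence
# must sum to certainty:
assert abs(sum(assignment.values()) - 1.0) < 1e-9

# "I feel more confident in good ratings than in bad ratings":
good = sum(p for r, p in assignment.items() if r >= 6)
bad = sum(p for r, p in assignment.items() if r <= 4)
print(good > bad)  # True
```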
<p>The thing I'd like you to consider is that when we are discussing probabilities, we are talking about our degree of uncertainty about some concrete propositions. If the propositions appear fuzzy and their truth seems undecidable in principle, then we have to become more specific first and clarify what we mean before we can even start talking about and asking questions about probabilities. Obviously, if we don't even know what our questions are about, we cannot expect any definite and useful answers.</p>
<p>Incidentally, propositions like "a die throw result is 4" or "a coin throw outcome is heads" are very clear. Pretty much everyone agrees on what they mean and could check their truth just like anyone else. Now you see one reason why these sorts of propositions are so eagerly used in classroom introductions to probability. Still, there are many other propositions that are just as concrete and a lot more fun to think about than these trivial examples.</p>
<p>Finally, note that the very reason we talk about probabilities of propositions is that, although they are verifiable in principle (their truth could be checked - and we know how), they may be quite hard to verify in practice. Maybe the proposition is about something that has not happened yet; it could just as well be about some past event. If we were able to find out directly whether it is true or false, we would of course just do so, and we wouldn't waste time talking about its "probability". Probability is for situations where we have to <i>infer</i> the truth of a proposition from whatever indirect clues we <i>can</i> collect without working miracles or spending a fortune.</p>
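<p>Such inference from indirect clues is, at bottom, still just counting. A minimal sketch of inferring a hidden fact - which urn a ball was drawn from - by enumerating equally likely cases; the two-urn setup below is invented for illustration:</p>

```python
from fractions import Fraction

# Invented setup: urn A holds 3 red and 1 white ball, urn B holds 1 red
# and 3 white.  An urn was picked by a fair coin flip and a red ball was
# drawn.  How probable is the proposition "the urn was A"?
# We enumerate every equally likely (urn, ball) case and take a ratio.
cases = [("A", color) for color in ["red"] * 3 + ["white"] * 1] \
      + [("B", color) for color in ["red"] * 1 + ["white"] * 3]

red_cases = [c for c in cases if c[1] == "red"]
a_and_red = [c for c in red_cases if c[0] == "A"]

p_a_given_red = Fraction(len(a_and_red), len(red_cases))
print(p_a_given_red)  # 3/4
```

<p>This is Bayes' rule worked out by hand: of the four equally likely ways a red ball could have appeared, three involve urn A, so the red clue shifts our confidence from 1/2 to 3/4.</p>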
<div style="text-align: right"><a href="/2008/08/introduction-to-probability-theory-part_6153.html">Continued in Part 3...</a></div>
<h3>Introduction to Probability Theory, Part 1</h3>
<p style="font-style:italic">In this series of tutorial-style articles I recap what I have learned about probability theory from studying the work of E. T. Jaynes (available online <a href="http://omega.math.albany.edu:8008/JaynesBookPdf.html">here (book)</a> and <a href="http://bayes.wustl.edu/etj/science.pdf.html">here (lectures)</a>), which I recommend - with some reservations. The introductory parts are easy to read and enjoy. However, the later chapters are dominated by references to physics and by mathematical formulae whose explanations are rather too brief for my taste. Jaynes seemed to write for graduate students of physics (even though I believe that was not his intention). I feel that his ideas are so intriguing and general that they deserve a broader audience. The goal of these posts is to introduce the most important concepts with fewer assumptions about the reader's level of mathematical sophistication - and to verify my own understanding in the process.</p>
<h4>It's not just about coins and dice!</h4>
<p>If you are like most people, you were introduced to the concept of probability at school with examples such as throwing dice, flipping coins, selecting cards from a deck, spinning lottery wheels, pulling colored balls from urns and so on. You will find plenty of such examples in various tutorials on the web, too. While there is nothing wrong with them in general, they can leave the impression that this is what "probability theory" is <i>all</i> about: a rather boring application of basic arithmetic to some idealized, useless "random experiments" that no one cares about in real life. That is, unless they are after good marks for mechanical answers to silly questions like "what is the probability of scoring more than 2 but fewer than 8 with two dice". It appears just about as exciting and thought-provoking as solving quadratic equations for sport.</p>
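<p>To be fair, the silly question does have a one-minute answer, and it shows off the counting that everything else in this series rests on. A quick sketch, assuming nothing beyond two fair six-sided dice:</p>

```python
from fractions import Fraction
from itertools import product

# The classroom question: with two dice, what is the probability of
# scoring more than 2 but fewer than 8?  Count the equally likely
# outcomes and take a ratio - no deeper machinery needed.
outcomes = list(product(range(1, 7), repeat=2))            # all 36 rolls
favorable = [(a, b) for a, b in outcomes if 2 < a + b < 8]

p = Fraction(len(favorable), len(outcomes))
print(p)  # 5/9 (20 of the 36 rolls)
```

<p>The point of the rest of this series is that the very same counting, dressed up a little, covers far more interesting questions than this one.</p>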
<p>What they usually don't tell you is that probability theory describes what you - and everyone else - have been doing for your whole life with more or less success, without even realizing. All kinds of reasoning and decision making depend on probabilities that people assign to various propositions:</p>
<img src="http://upload.wikimedia.org/wikipedia/en/e/e8/Escher_Waterfall.jpg" alt="Escher's Waterfall" style="float:left; width: 200px; margin-right: 20px" />
<ul>
<li>Whenever you look at something (like Escher's drawing of a waterfall on the left), you unconsciously figure out the probabilities of seeing different scenes. You make up your mind what the scene is about and whether it is "real" or not;</li>
<li>Before you cross a street, you unconsciously figure out the probability of being hit by a car versus getting to the other side safely;</li>
<li>Whenever you decide to buy something, you figure out the probability of getting good value for your money;</li>
<li>Detectives figure out whodunit based on the probabilities of finding particular pieces of criminal evidence;</li>
<li>Criminals figure out how to reduce the probability of getting caught;</li>
<li>Scientists figure out which explanation is more probable than others for an observed phenomenon;</li>
<li>Businessmen figure out which deals are more likely to bring them profits;</li>
<li>Politicians figure out which public statements are more likely to bring them voters;</li>
<li>and so on, and so forth.</li>
</ul>
<p>The really important thing to notice here is that we are almost never 100% certain about anything. We can be rather sure or rather doubtful about different things, but we can hardly ever <i>honestly</i> proclaim: "I know it's a sure thing" or "I know it's completely impossible" - except perhaps when trivial and uninteresting stuff is concerned. To put it in a slightly different way, whenever we need to think and make choices, there is always some uncertainty involved.</p>
<p>Real applied probability theory is about systematically improving our everyday thinking and decisions:</p>
<ul>
<li>It's about drawing the best conclusions from whatever we already know and understand;</li>
<li>It's about not getting fooled and confused;</li>
<li>It's also about knowing how to act to become more knowledgeable about stuff that matters.</li>
</ul>
<p>The concept of probability is quite difficult to grasp, even though it is mathematically very simple. Only a tiny part of it is about throwing dice and shaking urns in the classroom.</p>
<div style="text-align: right"><a href="/2008/08/introduction-to-probability-theory-part_17.html">Continued in Part 2...</a></div>