It has been asserted by one *gwern branwen* that data from unknown and unverified sources can be used to analyze penile lengths. The proposer of this view states:

“My own inclination is towards Bayesian logics, which in inductive matters (which this is) enforce use of all available background information – which includes the low-quality data you deprecate.”

So let’s examine inductive Bayesian logic. Bayesian logic proposes that one can obtain a better probability estimate if one is given more data. Thus the probability that event X has occurred is less accurate than the probability that X has occurred *given* that event Y has occurred, where Y is in some way related to X. This assumes that one knows that Y is related to X. Bayesian logic thus uses the individual’s prior experience of Y to evaluate X. If the individual (I) did not experience Y, then to make the Bayesian inference he would need to be told that Y is related to X. This further assumes that whoever tells him that Y is related to X is correct.

In simplistic mathematical terms: P(X) ≠ P(X|Y), Y = kX, where k is some proportionality constant.

One example is a pair of acoustically similar questions: “Have you got four candles?” and “Have you got fork handles?” The store assistant, based on prior experience, will infer that the customer means ‘four candles’ because that is what previous customers have asked for more often than ‘fork handles’. Thus, for the assistant,

P(four candles|candles & handles) > P(fork handles|candles & handles)
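The assistant’s reasoning can be sketched numerically. The snippet below is only an illustration under made-up assumptions: the counts of past requests are invented, the two phrases are assumed to sound exactly alike, and `posterior` is a hypothetical helper name, not part of any library.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Apply Bayes' theorem over mutually exclusive hypotheses.

    priors:      dict mapping hypothesis -> P(hypothesis)
    likelihoods: dict mapping hypothesis -> P(evidence | hypothesis)
    """
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(unnormalized.values())  # P(evidence)
    return {h: p / evidence for h, p in unnormalized.items()}


# Hypothetical history: of 100 past customers, 90 wanted four candles
# and 10 wanted fork handles.
priors = {
    "four candles": Fraction(90, 100),
    "fork handles": Fraction(10, 100),
}

# Assumption: the two requests sound identical, so the sound heard is
# equally likely under either hypothesis.
likelihoods = {
    "four candles": Fraction(1),
    "fork handles": Fraction(1),
}

result = posterior(priors, likelihoods)
for hypothesis, probability in result.items():
    print(hypothesis, float(probability))
```

With these assumed numbers the assistant’s best guess is ‘four candles’ at a probability of 0.9, because an identical sound leaves the prior unchanged.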

This does not, however, tell us about the accuracy of the Bayesian inference. The customer might indeed have asked for handles, while the assistant using Bayesian inference would assume they meant candles.

The data published by the site World Penis Average Size Studies Database (WPASSD) includes studies and sources that no one has heard of. While some of the sources do exist, the source data and the database data do not match. For instance, there is an inflation of 1.3 inches for Italians and a deflation of 0.4 inches for the English. If one were to use inductive Bayesian logic, data from the WPASSD would have to be used in conjunction with prior data. Which prior data is this? Let’s apply Bayesian logic broadly.

Let (As) = Asian male penile length, (Af) = African male penile length and (Eu) = European male penile length.

gwern branwen asserts that As < Af, As < Eu and Eu < Af. He claims that the data is sparse:

“There are few good sources, that is true. But the burden of proof is on anyone who wants to claim equality – heights, weights, obesity rates, all these differ and sometimes remarkably so. Why would various anatomical parts be the exception?”

When told there are some data from the medical literature, he strangely asks that I list them for him. So how exactly did he perform Bayesian logical inference on this issue? Here is how he most likely did it with the WPASSD’s low-quality data.

=> P[(As|WPASSD) < (Eu|WPASSD) < (Af|WPASSD)] > 0.5, thus Asians have the shortest penile lengths of any ethnic group.

There is no source analysis, no correlation and no data analysis (for instance, no account of the fact that some lengths are supposedly self-measured while others were measured by someone else). Bayesian logic requires that you have an initial probability and that this is then changed by the addition of verified data. His initial data is from J.P. Rushton.

In human penis size, the controversy is usually over African or African-American penis sizes compared to Caucasian penis sizes; but even sources criticizing Philippe Rushton’s review Race Differences in Behaviour: A Review and Evolutionary Analysis citing many studies (eg. Masters and Johnson) and reviews finding that black > white > east Asian admit that white penis sizes are larger than east Asian (apparently collectively measured an inch average size difference in favor of whites). A miscellany of studies & sources compiled at the anonymous World Penis Average Size Studies Database rank East Asian countries low on average length; if nothing else, it makes for some sardonicly hilarious economics papers like Male Organ and Economic Growth: Does Size Matter?. Well, it doesn’t especially matter.

So his original data is from Rushton and has been verified by unknown “sources criticizing Philippe Rushton’s review.” One wonders which sources these are. Since neither the original nor the subsequent data are verified, Bayesian logic has not been performed. So why the claim to be using Bayes’ theorem? Because it sounds scientific and might scare away the scientifically uninitiated. Now, an inference is something assumed to be true on the basis of evidence. But since he does not know of any medical evidence, his only evidence is that of the WPASSD. He thus assumes that the WPASSD is correct and uses this to pretend to be performing Bayesian analysis.
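The point that an initial probability is only changed by verified data can be illustrated with a toy calculation. This is a sketch under stated assumptions, not a model of the WPASSD or of anyone’s actual reasoning: `update` is a hypothetical helper, the prior of 1/2 and the reliability values are invented, and an unreliable source is assumed to support the hypothesis half the time regardless of the truth.

```python
from fractions import Fraction


def update(prior, reliability):
    """Posterior for hypothesis H after a source reports in favour of H.

    Assumed noise model: with probability `reliability` the source reports
    the truth; otherwise it supports H half the time, whatever the truth is.
    """
    half = Fraction(1, 2)
    p_report_given_h = reliability + (1 - reliability) * half
    p_report_given_not_h = (1 - reliability) * half
    numerator = prior * p_report_given_h
    return numerator / (numerator + (1 - prior) * p_report_given_not_h)


prior = Fraction(1, 2)  # assumed starting point: no opinion either way
for reliability in (Fraction(0), Fraction(1, 2), Fraction(9, 10)):
    print(float(reliability), float(update(prior, reliability)))
```

Under this assumed model a source with zero reliability leaves the prior at 0.5, while one that is right nine times out of ten raises it to 0.95. How far the estimate moves depends entirely on how far the source can be trusted.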

**Sadly, his site has now disabled public comments (and deleted my previous comments) but one can send him anonymous feedback**.

Update: A PDF of the article with comments can be found here