I like Michael's model. I wouldn't be bothered if both sets of results appeared in the same paper so long as there was a clear distinction between the paper A analysis and the paper B analysis. I think his point about using the most influential measure gets at something really important: pre-analysis plans can be viewed as a mechanism to commit not only the researcher but also the researcher's audience. It's a contract. The researcher is committed to doing things a certain way, and the audience is committed to accepting the result as a valid judgment on a hypothesis. The pre-analysis plan should be workshopped with this kind of mutual commitment in mind---to carry the analogy, the contract requires input and negotiation from both parties. This should result in study designs and analysis plans that assess important hypotheses in ways that are convincing, ex ante, to the relevant community of researchers. Ideally the contract includes a conditional commitment to publish; this may be hard to do formally, which is another reason it is important to workshop the pre-analysis plan---to gain some kind of informal commitment. When a paper comes out having been preceded by such a contract, critiques such as Gelman's would have to contend with the fact that the study was executed according to standards considered compelling ex ante in the discipline.
On index construction, I have some notes in my Quant Field Methods course on this issue. PCA and things like mean effects (or, what I prefer, inverse covariance weighted averages) have different logics. Here is a simple example. Suppose you have three variables: math standardized test score, math grades, and verbal standardized test score. For the example, suppose the two math variables are highly correlated but the verbal score is only weakly correlated with either of the math variables. If you took, say, the inverse covariance weighted average of the standardized scores, you'd get a score that gives about 50% weight to the verbal score and about 25% weight to each of the math variables. This might be considered a measure of "aptitude", like the SAT. It is an optimal combination of three variables that are considered ex ante to contribute to a common concept (aptitude), even if they are not all strongly correlated with each other (aptitude consists of different things). If you used PCA, you'd recover essentially two dimensions, one that consists solely of math contributions, and another that consists solely of the verbal contribution. That buys you back a degree of freedom, which is good, but that's about all you've achieved. Which approach is appropriate depends on whether you have an ex ante reason to think there is a common construct to which all three variables contribute.
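The three-variable example can be checked numerically. What follows is only a minimal sketch in Python with NumPy, not code from the course notes: the correlation values (0.9 between the two math variables, exactly zero between the verbal score and each math variable) and the helper name `icw_weights` are illustrative assumptions. It takes the inverse covariance weights to be proportional to the row sums of the inverse correlation matrix, normalized to sum to one, and runs PCA as an eigendecomposition of the same matrix.

```python
import numpy as np

# Assumed correlation matrix for (math test, math grades, verbal test).
# rho = 0.9 and the zero verbal correlations are illustrative, not real data.
rho = 0.9
corr = np.array([
    [1.0, rho, 0.0],
    [rho, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])


def icw_weights(sigma):
    """Inverse covariance weights: Sigma^{-1} 1, normalized to sum to one.

    Hypothetical helper name, used here only for illustration.
    """
    ones = np.ones(sigma.shape[0])
    raw = np.linalg.solve(sigma, ones)  # same as inv(sigma) @ ones
    return raw / raw.sum()


weights = icw_weights(corr)

# PCA on the correlation matrix. eigh returns eigenvalues in ascending
# order, so reorder them from largest to smallest.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]  # columns are the principal directions

print("ICW weights (math test, math grades, verbal):", np.round(weights, 3))
print("PCA eigenvalues:", np.round(eigvals, 3))
print("PCA loadings (one column per component):")
print(np.round(eigvecs, 3))
```

Under these assumed correlations the weights come out at roughly 26%, 26%, and 49%, approaching 25/25/50 as the math correlation approaches one. The first principal component loads only on the two math variables, the second only on the verbal score, and the third (eigenvalue 0.1) is the small leftover contrast between the two math variables.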