Today I thought I would blog about a paper on making high-tech start-ups more investment-ready that I recently had accepted at the Review of Economics and Statistics (co-authored with Ana Cusolito and Ernest Dautovic). I thought this might be of broader interest, even for those who don’t care at all about firms, for discussing the logistics of conducting an experiment in multiple countries, and for discussing how exploratory findings can still emerge through the refereeing process after a pre-analysis plan.
An investment-readiness experiment in five countries
The basic idea behind the experiment was that even when innovative start-ups and SMEs in developing and transition countries have good ideas, they may not have these ideas fine-tuned to the stage where they can attract outside funding. A lack of investment readiness shows up in a reluctance to consider surrendering partial ownership of the business, a lack of knowledge about the availability of different sources of external finance, a lack of understanding about the key factors investors are looking for, and presentational failings such as not being able to give a compelling pitch of the business idea.
We wanted to test whether investment readiness programs, which provide a mix of individualized training, mentoring, and coaching, are sufficient to make firms more investment-ready and get them to the point of receiving external investment. Our focus was on the Western Balkans region, where increasing innovation is seen as a key priority, and where efforts to increase the supply of equity finance were subject to concerns about whether there would be sufficient firms who were in a position to receive this investment.
We launched a call simultaneously in five countries in the region: Croatia, Kosovo, North Macedonia, Montenegro and Serbia. This call sought innovative firms seeking or considering venture financing. We ended up with a set of 346 firms, mostly involved in high-tech innovative industries such as cloud computing and big data, app development for a wide range of business and personal services, pharmaceutical products, etc. These were quite different from the types of firms I have seen before in typical developing country field experiments – for example, there was a firm developing virtual reality software that could be used in outdoor interactive missions, an architecture firm that developed an innovative luxury “boatel” that runs on an electric motor and can be used to stay on lakes, and a firm that had developed solar-powered benches for public spaces. Half the founders had post-graduate education.
These firms were divided into two groups:
· A Treatment Group of 174 firms were offered an intensive two-month program that included a structured online business assessment, individualized mentoring (averaging 11 hours per firm), weekend masterclasses, and pitch training. The cost was approximately $4,000 per firm.
· A Control Group of 172 firms were offered an online course through the University of Texas, at a cost of $153 per firm.
Following the intervention, both treatment and control firms were invited to a pitch event in collaboration with the Belgrade Venture Forum. Panels of independent judges then scored the firms on different aspects of their investment readiness, and selected 54 finalists to go on to pitch to investors. We measure impacts of this program on scores in this competition, on subsequent media coverage over the next two years, and on firm outcomes over the next two years.
What did we find?
We registered the trial and set out a pre-analysis plan that aimed to look at both whether the investment readiness scores in the pitch competition improved, and whether this subsequently translated into a greater likelihood of getting equity investments.
· The program did improve investment readiness as scored by judges: the investment readiness scores were 0.3 standard deviations higher on average for firms in the treatment group, and they were more likely to be selected to proceed to pitch in front of investors.
· There was a positive, but statistically insignificant, impact on going on to receive external financing over the next two years. Treated firms were 5 percentage points more likely to get this financing, with a 95% confidence interval of (-4.7 p.p., +14.7 p.p.).
The key question was then what to make of this null result on getting equity. Our initial viewpoint was to see this as an issue of statistical power. I wrote a blog post on the funnel of attribution, noting the difficulty of measuring impacts on outcomes further along the causal chain than on the first steps. In this specific case, one key reason for low power was that the control group was more successful in getting external funding than we had initially anticipated: 24% of control firms succeeded in getting external financing. We show in the paper that the increase in investment readiness, multiplied by the control-group relationship between investment readiness scores and equity outcomes, gives a predicted treatment effect close to what we find – but we are underpowered to detect an impact of this size.
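To see why this is a power problem, a quick back-of-the-envelope calculation helps. The sketch below (purely illustrative, using only the numbers mentioned in this post: a roughly even 174/172 split and a 24% control-group financing rate) computes the minimum detectable effect for a binary outcome at 80% power and a two-sided 5% test, using the standard normal approximation for a difference in proportions:

```python
from math import sqrt
from statistics import NormalDist

# Back-of-the-envelope minimum detectable effect (MDE) for a binary
# outcome, using the numbers reported in the post (illustrative only).
n_treat, n_control = 174, 172
p_control = 0.24  # 24% of control firms got external financing

z_alpha = NormalDist().inv_cdf(0.975)  # two-sided 5% significance
z_beta = NormalDist().inv_cdf(0.80)    # 80% power

# Normal-approximation standard error of the difference in proportions,
# assuming the treatment rate is in the neighborhood of the control rate.
se = sqrt(p_control * (1 - p_control) * (1 / n_treat + 1 / n_control))
mde = (z_alpha + z_beta) * se
print(f"MDE: {mde:.3f}")  # about 0.13, i.e. ~13 percentage points
```

With a minimum detectable effect of around 13 percentage points, an experiment of this size simply cannot distinguish the 5 percentage point effect we estimate from zero.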
An alternative viewpoint we encountered was to view the null result as saying that this type of program was not effective. We received this type of feedback from some referees and policy audiences, leading to questions about whether these types of programs should be supported in the future.
What we learned through the publication process and referee suggestions
Researchers often have lots of complaints about the publication process, but this is a case where reviewer comments helped add key insights to our findings. The paper was first rejected from a couple of finance journals (the main complaints being whether it was of broad enough interest and “the statistical weakness of the results”). We then sent it to ReStat, and were fortunate to get a request to revise the paper. A key query raised by the referees was whether there was a third reason for the null result – that it was the average of large impacts for some subgroup and much smaller impacts for other types of firms. In particular, a referee asked whether the program worked better for smaller firms. Sure enough, this is what we found – the figure below shows results from rolling regressions which use 30% of the sample at a time. Smaller firms are less investment ready and find it harder to get funding without the program: 14 percent of control firms with 1 to 3 workers had received external financing at our two-year follow-up, versus 35 percent of those with 4 or more workers (p=0.002). The program then helps these smaller firms improve, while not helping the larger firms.
Ex-post heterogeneity: the program only worked for smaller firms
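For readers unfamiliar with this rolling-regression exercise, the sketch below shows the mechanics on simulated data (the data-generating process, sample size, and window step are all assumptions for illustration, not the paper's actual data): sort firms by baseline size, take a sliding window covering 30% of the sample, and estimate the treatment effect within each window. With a binary outcome and a treatment dummy as the only regressor, the OLS coefficient is just the treatment-control difference in means.

```python
import random
from statistics import mean

random.seed(0)

# Illustrative sketch on SIMULATED data: rolling regressions of the
# financing outcome on treatment, within windows of 30% of the sample
# sorted by baseline firm size.
n = 346
firms = []
for _ in range(n):
    size = random.randint(1, 10)          # baseline number of workers
    treated = random.random() < 0.5
    # assumed DGP: the program only helps smaller firms (size <= 3)
    p = 0.10 + 0.02 * size + (0.15 if treated and size <= 3 else 0.0)
    firms.append((size, treated, random.random() < p))

firms.sort(key=lambda f: f[0])            # sort by baseline size
window = int(0.3 * n)                     # each window holds 30% of firms

effects = []
for start in range(0, n - window + 1, 10):  # slide in steps of 10 firms
    sub = firms[start:start + window]
    t = [y for (_, d, y) in sub if d]
    c = [y for (_, d, y) in sub if not d]
    if t and c:
        # difference in means = OLS coefficient on the treatment dummy
        effects.append(mean(t) - mean(c))

print(f"effect in smallest-firm window: {effects[0]:+.2f}")
print(f"effect in largest-firm window:  {effects[-1]:+.2f}")
```

Plotting `effects` against the median firm size in each window produces the kind of heterogeneity figure described above.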
We then explore this further using endogenous stratification. This uses the baseline covariates and control group to predict (using split sampling or leave-one-out estimators) the likelihood a firm would get financing anyway without our intervention. We then find that the investment readiness program is estimated to have a positive and significant impact of 12.4-14.3 percentage points on receiving external investment for those firms who otherwise would be in the bottom half of firms in our sample in terms of likelihood of receiving an investment – and a small and insignificant negative point estimate for those firms that were highly likely to get financing anyway.
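A minimal sketch of the endogenous stratification logic, again on simulated data (the single covariate, linear probability fit, and data-generating process are all simplifying assumptions; the paper uses the full set of baseline covariates): fit the prediction model on control firms only, with each control firm left out of its own fit, then compare treatment effects above and below the median predicted probability of getting financed anyway.

```python
import random
from statistics import mean

random.seed(1)

# Illustrative sketch of endogenous stratification on SIMULATED data.
# Predict each firm's probability of financing absent treatment from a
# baseline covariate (firm size), using control firms only with a
# leave-one-out fit; then compare treatment effects by predicted half.
n = 346
size = [random.randint(1, 10) for _ in range(n)]
treat = [random.random() < 0.5 for _ in range(n)]
# assumed DGP: bigger firms get funded anyway; program helps small firms
y = [random.random() < 0.10 + 0.03 * s + (0.15 if d and s <= 3 else 0)
     for s, d in zip(size, treat)]

def ols_fit(xs, ys):
    """One-covariate OLS: returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (yy - my) for x, yy in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

controls = [i for i in range(n) if not treat[i]]
pred = [0.0] * n
for i in range(n):
    # leave-one-out: a control firm is excluded from its own fit
    idx = [j for j in controls if j != i]
    a, b = ols_fit([size[j] for j in idx], [y[j] for j in idx])
    pred[i] = a + b * size[i]

cut = sorted(pred)[n // 2]                # median predicted probability
for label, grp in [("low", [i for i in range(n) if pred[i] < cut]),
                   ("high", [i for i in range(n) if pred[i] >= cut])]:
    t = mean(y[i] for i in grp if treat[i])
    c = mean(y[i] for i in grp if not treat[i])
    print(f"{label}-probability firms: effect = {t - c:+.2f}")
```

The leave-one-out (or split-sample) step matters: fitting the prediction model on the same observations used to estimate the stratum-specific effects would bias the results.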
Reflections, implications, and general points
These new findings of treatment heterogeneity suggest the program really can be effective, and that the key is targeting it to the right types of firms. This is definitely a more positive message for policy. The question is how we should view these non-pre-specified results. We pre-specified only looking at heterogeneity according to baseline investment readiness: the point estimates suggested larger effects for firms that were less investment-ready to begin with, but the interaction wasn’t significant. We did not then do further exploratory analysis, wary of concerns about data mining for heterogeneity. But because this was a dimension suggested by a referee, we view it as somewhat more credible than if we had just put it in ourselves (while still noting in the paper that it was ex-post). This highlights a possible advantage of the publication process – offering some structure on the exploratory part of understanding results – so long as referees don’t ask about heterogeneity in many different dimensions. (Perhaps we need a rule that a referee can ask the authors to explore heterogeneity on at most one or two variables.)
A few other things I learned from this project:
· The first is thinking more about pre-specifying this endogenous stratification approach to heterogeneity when key outcomes are binary – for a number of outcomes like finding employment, starting a business, getting a loan, etc., there will be some individuals or firms that are likely to achieve the outcome regardless of our intervention, and the binary outcome therefore gives them no room to improve. Looking at heterogeneity by the likelihood of achieving the outcome will therefore be sensible in many settings.
· Implementing an experiment in multiple countries is logistically hard. The program and experiment were run in five countries simultaneously. This had the key advantage of boosting our sample size, but raised several important logistical challenges. It required more coordination with multiple agencies. More importantly, it hampered participation – we had rotating masterclass weekend events in different countries, but very few entrepreneurs would travel to a different country to attend. Pitching to judges occurred alongside the Belgrade Venture Forum, meaning firm owners all had to travel to this city to pitch their ideas – with travel times of 4-7 hours from the other countries. As a result, only 61% of firms attended the semi-finals. In the future I would push for more decentralized events to reduce these travel frictions.
· Thinking creatively about other data sources: it was a real challenge to get firms to respond to surveys, and it required an intensive effort to get to 85% responding after two years. We thought hard about what other information we could gather on whether these firms were gaining market traction and customer attention. One source was a specialist media intelligence firm we contracted, which tracks more than 250,000 global news sources in 190 countries in 25 languages (including Serbo-Croatian and Albanian) – using this we could show the treated firms were more likely to get mentioned in the media.
· Plan early for what you can say on those not in your program: a question from referees on both this paper and another paper I’m currently revising is how the applicants to the program compare to other firms in the country. Often data on these non-applicants is really scarce, so planning ahead for being able to say something on this would be useful.