Introduction

In funds management, socially responsible investment (SRI) is the practice of incorporating social goals into the investment decision-making process. These social goals are based upon environmental issues, human rights, community involvement and labor relations. From the SRI fund manager’s perspective these social goals translate to investment prohibitions for industries considered antisocial, such as the alcohol, tobacco, gambling and defense industries.

In recent years, the popularity of SRI funds has increased due to increasing awareness of antisocial investments by investors, companies, governments, activists and the media. In the USA, the Social Investment Forum (2007) estimates that SRI covers over 10 % ($2.71 trillion out of the $25.1 trillion) of all assets under management.

A key question for SRI investors is whether SRI fund managers can demonstrate investment skill. Traditional economic thinking states that the imposition of noneconomic values by the trustees of managed funds is inappropriate and that ‘the social responsibility of business is to increase its profit’ (Friedman 1970). In theory, because SRI constraints eliminate entire ‘sin’ industries from the investment set, a fully diversified portfolio is not possible, leading to SRI underperformance compared with both the broader market and unconstrained fund managers (Humphrey and Lee 2011). Even worse for SRI, evidence suggests that ‘sin’ stocks outperform both statistically and economically (Fabozzi et al. 2008; Hong and Kacperczyk 2009),Footnote 1 and therefore their removal may directly contribute to the underperformance of SRI.

Here, we are not primarily interested in comparing SRI with non-SRI performance. Instead, if we accept that substantial funds will be invested in SRI irrespective of any issues and/or costs, then our focus should be on whether evidence exists for SRI investment skill in isolation. We want to know if SRI managers have skill once the investment constraints under which they operate are properly considered. Correctly accounting for SRI constraints is vital for the accurate detection of skill. After all, estimates of skill vary by performance benchmark (Statman and Glushkov 2009). Typically, the ‘alpha’ from a standard asset-pricing model is a proxy for skill. However, standard asset-pricing models assume that no investment constraints exist. In the SRI context, this is patently false and can lead to biased alpha estimates and false conclusions about the existence of skill.

To correctly measure the performance of SRI managers, we introduce a novel three-factor Fama–French asset-pricing model with the aim of estimating alpha, and hence investment skill, without bias. In a similar way to earlier work that removed banking stocks from the investment set, we remove those stocks in SRI-prohibited industries from the investment set and recreate the Fama–French ‘size’ and ‘value’ portfolios. We then use these portfolios to generate our new SRI Fama–French model and re-estimate an SRI-specific alpha, in the belief that in this way we can correctly measure manager skill without bias.

We argue that the SRI investment constraints that lead to the removal of entire industries from the investment set induce systemic changes in the risk factors driving SRI returns, and that therefore new risk factors must be re-estimated. We should not overlook these SRI investment constraints and naively re-estimate new loadings on the standard risk factors as the SRI constraints may affect risk factors in complex ways. Instead, we estimate both new risk factors and new loadings on them and then measure skill.

As a result, when we use our SRI-constrained Fama–French model, approximately 20 % of our sample has significant alpha, in contrast to the standard Fama–French model, which provides no evidence of skill for any SRI fund. For those funds that exhibit skill, alpha appears to be economically significant too, suggesting (on average) about 4 % per annum outperformance.

These results are important for the SRI mutual fund industry. The impact of excluding the ‘sin’ stocks and their large returns (Fabozzi et al. 2008; Hong and Kacperczyk 2009) appears to be very important when estimating skill. We show that once the investment constraints under which SRI fund managers operate are considered, evidence of SRI manager skill is apparent. These results are robust to changes in time period (see Appendix 1).

Discussion and Literature Review

Much of the current SRI research compares the performance of SRI funds with non-SRI funds. It is argued that, because SRI constraints eliminate entire industries from the investment set, a fully diversified portfolio is not possible, resulting in reduced returns and increased idiosyncratic risk, and hence in underperformance compared with the broader market and unconstrained fund managers (Humphrey and Lee 2011).

Some studies find slightly superior returns for SRI funds (Abramson and Chung 2000; Diltz 1995; Mallin et al. 1995; Sauer 1997), others find no statistical difference (Bauer et al. 2005; Guerard 1997; Kurtz 1997), and still others find a slight underperformance of non-SRI funds (Goldreyer and Diltz 1999; Girard et al. 2007). Statman and Glushkov (2009) attribute results of no overall SRI outperformance to the notion that the advantage of the tilt towards highly socially responsible stocks is offset by the disadvantage of excluding high-performing ‘sin’ stocks.

Choosing an appropriate benchmark is vital for ascertaining if a manager has skill and can outperform a passive investment strategy. In the mutual fund context, Roll (1978) instigated research into alternative benchmarks after highlighting issues with the standard security market line benchmark. Alternative benchmarking has been extensively explored since, with the mutual fund literature populated with papers examining the impact of alternative benchmarks on mutual fund performance. This work shows that when appropriate alternative benchmarks are used, different conclusions about manager skill are often reached. For example, Carhart (1997) finds that mutual fund persistence (i.e. ‘hot hands’) is explained by augmenting the Fama–French model with an additional momentum factor and not by ‘genuine’ skill. When using characteristic-based benchmarks, Daniel et al. (1997) find that aggressive growth funds can exhibit stock selection skill. Hartzell et al. (2010) augment single REIT benchmark models with multi-benchmark models, and thus reduce or eliminate evidence of manager skill. In summary, Chan et al. (2006) state that mutual fund benchmarking practice often extends the standard CAPM benchmark by incorporating size and value risk factors, and that whilst (p. 37) ‘…it seems that the methods should all deliver more or less the same assessment about the level of manager performance. Our analysis…suggests that this is not the case.’ Indeed, as we shall see, our work finds that using the standard risk factors does not lead to evidence of skill amongst SRI managers; however, once ‘sin’ stocks are eliminated in the construction of risk factors, we do find evidence of skill.

Here, we are not primarily interested in comparing SRI with non-SRI performance. Instead, our focus is on whether evidence exists for SRI investment skill in isolation. We want to know if SRI managers have skill once the investment constraints under which they operate are properly considered. Correctly accounting for SRI constraints is vital for the accurate detection of skill (Statman and Glushkov 2009).

We draw upon three key studies. The first, by Barnett and Salomon (2006), examines US mutual funds from 1972 to 2000 by regressing risk-adjusted returns against a ‘screening intensity’ variable equal to the number of screens employed by the fund. The major contribution of this study is the authors’ finding of a positive curvilinear relation between social and financial performance (paradoxically, they find no evidence of a linear association between the number of screens and fund performance). However, the study and its results are limited by the use of the capital asset pricing model (CAPM) to measure risk-adjusted returns, a model that has been criticized by subsequent research (Fama and French 1996). The second study, by Lee et al. (2010), is similar to the first, with the benefit of using the more modern Carhart (1997) model (an extension of the Fama–French model) to measure risk-adjusted returns. Using this modern asset-pricing model, the authors find little evidence of a relation between screening intensity and risk-adjusted returns, in contrast to Barnett and Salomon (2006). The notion of screening intensity as an effective proxy for the effect of SRI screens is also questioned by Renneboog et al. (2008) who provide evidence of a negative relationship between performance and the number of social screens, but not the number of sin, ethical or environmental screens. The third study, by Statman and Glushkov (2009), is in a similar spirit to our own. It focuses on measuring performance on SRI indexes rather than SRI mutual funds, and applies the Fama and French model to different portfolios, one of these being a long/short of Accepted minus Shunned stocks. (We refer to shunned stocks as ‘sin’ stocks; accepted stocks are all other stocks.) Importantly, the authors understand that screening (which accounts for 68 % of the money in SRI mutual funds in 2005) is the most prevalent form of SRI and acknowledge that industry classification might affect measures of performance beyond the standard Fama–French factors.

Our contribution is to introduce, similar to Statman and Glushkov (2009), a variation of the Fama–French model in an attempt to integrate the effect of negative screens. Furthermore, we eliminate SRI-prohibited stock from the investment set and re-estimate the Fama–French (1993) model. We believe that our novel approach incorporates the effects of negative screens in a more elegant and natural manner than Barnett and Salomon (2006) and does so in a modern asset pricing setting like Lee et al. (2010). We also explicitly account for industry prohibitions unlike Statman and Glushkov (2009).Footnote 2 Furthermore, as we shall show, our approach clearly leads to finding evidence of SRI managerial skill.

Correctly accounting for industry effects is important when estimating skill. The ambiguous performance of screening intensity in Barnett and Salomon (2006) and Lee et al. (2010) is no surprise, since we believe that it is a pragmatic but weak proxy for the effect of industry-based SRI stock prohibitions. Its flaws are a consequence of the inherent difficulty in quantifying the impact of the various SRI screens on portfolio performance. Screening intensity simply counts each screen employed by an SRI fund. Whilst this has the advantage that nebulous screens such as governance, social, and environmental screens are considered, it does not consider how the overall returns and risk of an SRI portfolio are affected when SRI-prohibited stocks are removed from the investment universe. Ultimately, we cannot expect an imprecise measure of SRI screening such as screening intensity to be able to distinguish between the subtle performance variations among various types of SRI funds that employ slightly different screens.

Therefore, we do not use screening intensity in our work and, instead, devise a novel approach to measure the impact of screens on SRI portfolios accurately. Rather than simply counting screens, we rectify a deficiency in the screening intensity method of Barnett and Salomon (2006) and automatically adjust the risk factors for the excluded market capitalization and correlations of those stocks that we exclude.

Out of necessity, we narrow our focus to those SRI funds that use product screens, the most prevalent screen type, to exclude the stocks of ‘sin’ industries, rather than attempt the more ambitious goal of considering every possible screen type employed by SRI funds, including environmental, social and governance screens (see Research Method and Data). The advantage of this focus is that it is straightforward to translate product screens directly into SRI-prohibited stocks and then remove these from the Fama–French investment universe.

If there are any systemic differences in size and value across the industries excluded by SRI fund managers, it follows that the aggregate risk factors faced by these managers must change. Indeed, the literature suggests that industry affiliation is associated with the value effect. Both inter- and intra-industry variation in value explain stock returns (Banko et al. 2004). Kothari et al. (1995) also find that there is a relationship between industry BE/ME and industry returns. Finally, Fama and French (1997) show that cross industry variation in risk factor loadings on their model represents a large amount of the return premium associated with high value firms. So, it appears that industry classification can explain a portion of the value premium and hence expected returns. In turn, this means that the removal of SRI-prohibited industries from the Fama–French investment universe results in a change to the risk exposures faced by SRI managers. We should not overlook these changes and naively re-estimate new loadings on the standard risk factors for each manager.

Research Method and Data

Our methodology is pragmatically motivated. Of the four types of screens which an SRI fund manager may employ (environmental, social, governance and product), this study concentrates exclusively on those screens which are most prevalent: product screens. These screens prevent the SRI managers who use them from investing in stocks from ‘sin’ industries, including tobacco, alcohol, gambling and defense. We focus on product screens because, unlike the three other screen types, they translate directly and clearly into prohibited stocks. This means we can focus on those SRI managers that use product screens and remove the ‘sin’ stocks that belong to ‘sin’ industries from their investment set without any ambiguity.

Of course, a limitation of our approach is that, unlike screening intensity, it ignores the effects of the other three screen types. These screen types are intrinsically more difficult to measure and quantify than product screens. Fortunately, of the 88 SRI equity mutual funds in our universe, all but one use product screens. Forty-one use negative screens and 46 use best-of product screens. Therefore, by focusing exclusively on product screens we do not introduce significant bias to our analysis.

When using product screens, we must assign an industry to each stock. To proxy for industry, we use three common industry classification codes: the Standard Industrial Classification System (SICS), the North American Industry Classification System (NAICS), and the Global Industry Classification Standard (GICS). The SICS was developed in 1937 and uses a four-digit numeric code to classify companies. The NAICS was developed in 1997 to replace the SICS because the four-digit codes could no longer cope with developments in new services and, as such, the NAICS six-digit system was developed to provide more coverage and accuracy. Standard & Poor’s and Morgan Stanley Capital International developed the GICS system in 1999 because of the need for a single industry classification system to accommodate all industries in global markets. It was designed for financial professionals, and uses a two- to eight-digit system with four levels of classification. While the SICS and NAICS are ideal for economic and business research, the GICS is ideal for financial and investment research.Footnote 3 Thus, stocks are eliminated from the SRI investment universe based on their SICS, GICS and NAICS codes. Consequently, in total, three SRI Fama–French models are estimated, one for each industry classification system.

We believe that the GICS is ideal for filtering SRI-prohibited industries and stocks. Owing to its two- to eight-digit system, there exist separate classifications for each prohibited SRI industry (i.e. alcohol, tobacco, defense and gambling), whereas the SICS and NAICS group gambling with casino hotels and defense with aerospace. The separation of these industries is vital to accurately exclude stocks from the SRI Fama–French model consistent with each SRI product screen. Other studies have noted that using SICS/NAICS results in grouping defense/aerospace and gambling/casinos (Hong and Kacperczyk 2009). Because the SICS and NAICS do not match the SRI product screens perfectly, we focus on GICS (but report SICS/NAICS results too).
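To make the consequence of classification granularity concrete, the following minimal sketch filters a toy universe under a fine and a coarse labelling. The industry labels, tickers, grouping table and the function `filter_universe` are all hypothetical and chosen purely for illustration; they are not real GICS, SICS or NAICS codes, nor the software used in this study.

```python
# Illustrative sketch: the industry labels below are hypothetical stand-ins,
# not real GICS, SICS or NAICS codes.
FINE_PROHIBITED = {"ALCOHOL", "TOBACCO", "GAMBLING", "DEFENSE"}

# A coarser scheme (assumed for illustration) that merges defense with
# aerospace and gambling with casino hotels, so the merged groups can only
# be excluded as a whole.
COARSE_GROUP = {
    "DEFENSE": "AEROSPACE_DEFENSE",
    "AEROSPACE": "AEROSPACE_DEFENSE",
    "GAMBLING": "GAMBLING_HOTELS",
    "CASINO_HOTELS": "GAMBLING_HOTELS",
}
COARSE_PROHIBITED = {"ALCOHOL", "TOBACCO", "GAMBLING_HOTELS", "AEROSPACE_DEFENSE"}


def filter_universe(stocks, prohibited, grouping=None):
    """Return tickers whose (optionally regrouped) industry is not prohibited."""
    grouping = grouping or {}
    return [ticker for ticker, industry in stocks
            if grouping.get(industry, industry) not in prohibited]


# Hypothetical five-stock universe of (ticker, industry label) pairs.
universe = [
    ("AAA", "TOBACCO"),
    ("BBB", "SOFTWARE"),
    ("CCC", "DEFENSE"),
    ("DDD", "AEROSPACE"),
    ("EEE", "CASINO_HOTELS"),
]

fine = filter_universe(universe, FINE_PROHIBITED)
coarse = filter_universe(universe, COARSE_PROHIBITED, COARSE_GROUP)
print(fine)    # ['BBB', 'DDD', 'EEE']
print(coarse)  # ['BBB']
```

Under the coarse labelling, the aerospace and casino-hotel stocks are removed along with the genuinely prohibited ones, which is the kind of mismatch with the product screens described above.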

To achieve the goals of the study, we needed to replicate the Fama–French factors using our own programs and the data from CRSP/COMPUSTAT.Footnote 4,Footnote 5 Our replication achieves extremely high fidelity with the standard Fama–French factors, achieving return correlations of 0.991 and 0.974 for the SMB and HML factors, respectively. When all the SRI-prohibited stocks have been eliminated from the investment universe, the Small Minus Big (SMB) and High Minus Low (HML) factors are recalculated using our software, which replicates the method of Fama and French (1993). That is, six value-weighted portfolios excluding the SRI-prohibited stocks (Small/Low, Small/Medium, Small/High, Big/Low, Big/Medium and Big/High) are created and then used in the construction of the size and value portfolios. Here, SMB is the average return on the three small portfolios less the average return on the three big portfolios, while HML is the average return on the two value portfolios less the average return on the two growth portfolios. The market return in the Fama–French model, RMRF, is also recalculated after eliminating all SRI-prohibited stocks from the market.Footnote 6
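The factor definitions above can be sketched in a few lines. This is a minimal single-period illustration, not our replication software: the function names `value_weighted_return` and `smb_hml`, the two-letter portfolio keys (size first, then book-to-market) and all numbers are assumptions made for the example.

```python
def value_weighted_return(stocks):
    """Value-weighted return of (market_cap, return) pairs for one period."""
    total_cap = sum(cap for cap, _ in stocks)
    return sum(cap * ret for cap, ret in stocks) / total_cap


def smb_hml(p):
    """SMB and HML for one period from the six size/value portfolio returns.

    `p` maps 'SL', 'SM', 'SH', 'BL', 'BM', 'BH' to portfolio returns, where
    the first letter is size (Small/Big) and the second is book-to-market
    (Low/Medium/High).
    """
    small = (p["SL"] + p["SM"] + p["SH"]) / 3
    big = (p["BL"] + p["BM"] + p["BH"]) / 3
    value = (p["SH"] + p["BH"]) / 2    # high book-to-market portfolios
    growth = (p["SL"] + p["BL"]) / 2   # low book-to-market portfolios
    return small - big, value - growth


# Hypothetical one-month returns for the six filtered portfolios.
portfolios = {"SL": 0.010, "SM": 0.012, "SH": 0.015,
              "BL": 0.008, "BM": 0.009, "BH": 0.011}
smb, hml = smb_hml(portfolios)
print(f"SMB = {smb:.4f}, HML = {hml:.4f}")

# Each portfolio return is itself value-weighted over its (filtered) members.
vw = value_weighted_return([(100.0, 0.02), (300.0, 0.00)])
print(f"value-weighted return = {vw:.4f}")
```

In this sketch the SRI filter enters only through which stocks are allowed into each of the six portfolios; the averaging rules for SMB and HML are unchanged.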

A further complication arises with product screens since they occur in three forms: negative, positive and ‘best-of’ screening.Footnote 7 We focus on SRI funds that use negative screens, since they are clear about which SRI-prohibited stocks to eliminate from the Fama–French universe. However, we do not ignore the two other screening methods. We create three data samples: sample 1 includes the 25 funds that use only negative screening for SRI-prohibited stocks; sample 2 extends sample 1 and includes funds that combine negative screening with best-of screening; finally, sample 3 includes all the above funds plus funds that use purely best-of screening. As we include more imprecise screening and move from sample 1 to sample 3, we expect (and find, see below) our results to become weaker. This is because it becomes less clear how the more imprecise screening translates into stock prohibitions.

Unfortunately, restricting attention to negative screens leads to a large reduction in the number of SRI funds that can be used in the analysis. While larger sample sizes are generally preferred, in this context they also introduce exposure to the more imprecise best-of screens. For example, it is tempting to enlarge the data set by including funds that use both negative and best-of product screens. However, those SRI fund managers who apply best-of screens invest in the SRI-prohibited stocks that least contravene SRI goals.Footnote 8 If SRI funds that use best-of screens are included in our data sample, then there will be a mismatch between those stocks used to estimate the Fama–French risk factors of our novel model and the portfolios of SRI managers that use best-of screens. So, the results from samples 2 or 3 may be contaminated by SRI-prohibited stocks, leading to less accurate models, compared with those estimated using sample 1 only. Therefore, whilst we focus on sample 1 in the analysis, we also report results for SRI funds that use other product screening approaches (such as positive and best-of screens; see footnote 7).

We consider US SRI equity mutual funds over the period 1984–2006.Footnote 9 This is the longest sample of SRI mutual funds studied to date.Footnote 10 While the Social Investment Forum (2007) reports 201 SRI mutual funds in the USA, this study concentrates exclusively on the 66 SRI equity style funds.Footnote 11 After all, the Fama–French model is calibrated using equities. Screening data is collected from the Social Investment Forum, while monthly fund data, index data and Fama–French factors are sourced from the merged Center for Research in Security Prices (CRSP)/Compustat database.

Finally, to construct the filtered Fama–French benchmark model, monthly returns from all stocks trading on the New York Stock Exchange, American Stock Exchange, and NASDAQ apart from financials (as per Fama and French 1993 and above) are sourced from the CRSP/Compustat database.

It is important to note that this study focuses on US SRI funds, the largest SRI industry in the world. The prior focus of published research is also on the USA, which makes our results comparable with those of prior studies.

Results

Our aim is to re-estimate the Fama–French model so that it is more appropriate to the SRI funds management industry. We argue that it is not appropriate to measure skill using the standard Fama–French model, as it includes SRI-prohibited stocks. We shall show below that when SRI-prohibited stocks are removed, the Fama–French market, HML and SMB risk factors are subtly affected, resulting in evidence of skill for some SRI funds where none existed using the standard Fama–French risk factors.

We begin by examining the effect of removing SRI-prohibited stocks, identified under each of the three industry classification systems, on the Fama–French HML and SMB factors. Figure 1 examines the relative differences between each of the three filtered benchmark models and the standard unfiltered benchmark for the standard HML factor (which is zero in Fig. 1). The SICS and NAICS filtered factors track each other quite closely, whereas the GICS filtered factor deviates from 1993 onwards; presumably, this is due to the separation of gambling/casinos and defense/aerospace. Of the three industry filtering approaches, it is clear that the GICS filtered HML produces the smallest difference to the standard Fama–French HML factor. In Fig. 2, it is also evident that all three SRI-filtered models produce a higher SMB premium than the standard Fama–French SMB, with the GICS SMB increasing the most. Once again, the three track each other closely until 1993, after which they diverge abruptly. We also confirm the work of Fabozzi et al. (2008) and Hong and Kacperczyk (2009), who report that sin stocks outperform, so that, once they are removed, the market risk premium is reduced in our SRI Fama–French model.

Fig. 1 Cumulative wealth of an investment in HML. This figure presents the cumulative wealth of an investment of $1 in HML. The graph compares an investment in the unfiltered HML with an investment in the SICS, GICS and NAICS filtered HML over the period 1984–2006

Fig. 2 Cumulative wealth of an investment in SMB. This figure presents the cumulative wealth of an investment of $1 in SMB. The graph compares an investment in the unfiltered SMB with an investment in the SICS, GICS and NAICS filtered SMB over the period 1984–2006

In summary, we can see that removing SRI-prohibited stocks slightly increases both the Fama–French value and size factors. We will focus on GICS filtering for SRI-prohibited stocks since there exists a separate industry classification for each prohibited SRI industry, and it provides the most similar performance when compared to the standard Fama–French HML factor. We can see that the relationship between the conventional risk factors and the SRI factors is complex. This warrants the use of the new risk factors. We shall demonstrate that the changes in the size and value premia associated with these new SRI industry filtered size and value factors, while subtle, have an impact on the estimate of alpha, and hence the detection of manager skill.

Although not reported in detail here, we removed stocks from each prohibited industry in turn and studied what effect this had on the HML and SMB risk factors. For HML, we found that tobacco affected returns most, followed by defense, alcohol and finally gambling. For the SMB risk factor, tobacco was also most important, followed by small contributions of approximately equal size from the alcohol, defense and gambling industries.Footnote 12 We considered how the removal of each SRI-prohibited stock affected the HML premium by focusing on the four constituent portfolios comprising the HML. We found that, on the one hand, for value companies, the increases in the HML premium occurred primarily due to the increased returns of large value companies (BH) relative to the decreased returns of small value companies (SH). On the other hand, when we considered growth companies, we found that relative decreases in returns of big growth companies (BL) compared to small growth companies (SL) further slightly enhanced the HML premium (details are available from the authors).

In summary, we have seen that once stocks from prohibited SRI industries are removed, complex and subtle changes in the HML/SMB risk factors occur. It appears that SRI fund managers face slightly different risks when compared with unconstrained managers. This justifies the estimation of the new risk factors when measuring SRI manager skill.

Next, we examine if evidence for SRI mutual fund outperformance exists when using our novel Fama–French model. To achieve this, we model the returns of those SRI funds that use negative screening (data sample 1), minus the SRI-filtered market rate, and use the SRI-filtered HML and SMB factors as the explanatory variables. Table 1 reports the full regression results for data sample 1 using the unfiltered and GICS, SICS and NAICS filtered Fama–French benchmarks.Footnote 13
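As a minimal sketch of this kind of time-series regression, the code below estimates alpha and its t-statistic by ordinary least squares on simulated data. It assumes the usual Fama–French form, in which a fund's excess return is regressed on a constant and the RMRF, SMB and HML factors; the helper functions `invert` and `ols`, the simulated factor moments and the 'true' coefficients are all hypothetical and are not the estimates reported in Table 1.

```python
import math
import random


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(a)
    m = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(a)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(k):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[k:] for row in m]


def ols(y, X):
    """OLS via the normal equations; returns (coefficients, t-statistics)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)   # residual variance
    tstat = [b / math.sqrt(sigma2 * inv[i][i]) for i, b in enumerate(beta)]
    return beta, tstat


# Simulated data (hypothetical; not the sample used in this study).
rng = random.Random(0)
n_months = 276  # 23 years of monthly observations, the length of 1984-2006
true_alpha, true_b, true_s, true_h = 0.003, 1.0, 0.2, -0.1
rows, excess = [], []
for _ in range(n_months):
    rmrf = rng.gauss(0.005, 0.04)   # stand-in for the filtered market factor
    smb = rng.gauss(0.002, 0.03)    # stand-in for the filtered SMB factor
    hml = rng.gauss(0.003, 0.03)    # stand-in for the filtered HML factor
    noise = rng.gauss(0.0, 0.005)
    rows.append([1.0, rmrf, smb, hml])  # leading 1.0 is the intercept (alpha)
    excess.append(true_alpha + true_b * rmrf + true_s * smb + true_h * hml + noise)

coef, tstat = ols(excess, rows)
print(f"alpha = {coef[0]:.4f} (t = {tstat[0]:.2f})")
print(f"RMRF = {coef[1]:.3f}, SMB = {coef[2]:.3f}, HML = {coef[3]:.3f}")
```

Swapping the unfiltered factor series for the filtered ones changes only the columns of `rows`; the intercept of the fitted regression is the alpha that is then tested for significance.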

Table 1 Time series results using the alternative filtered benchmark for sample 1

Table 1 shows that no funds outperform when using the standard Fama–French model, but when the GICS filtered Fama–French model is used, eight funds outperform (funds 1, 2, 6, 7, 12, 14, 15 and 22), with alphas that are statistically significant at the 5 % level.Footnote 14 For data sample 1, the average alpha of those funds whose alphas are significantly different from zero is 0.0034. The SICS and NAICS filtered benchmarks produce almost identical alphas, suggesting that our findings are robust. Furthermore, Appendix 1 reports that the results are robust over time. Importantly, this means that we can detect evidence of manager skill using our novel SRI Fama–French model that is not detected using the standard Fama–French model.
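As a rough check on economic significance, and assuming that the average alpha of 0.0034 is a monthly figure (the regressions use monthly fund returns), it can be annualized either by simple scaling or by compounding:

```latex
\alpha_{\text{annual}} \approx 12 \times 0.0034 = 0.0408 \approx 4.1\,\%,
\qquad
(1 + 0.0034)^{12} - 1 \approx 0.0416 \approx 4.2\,\%.
```

Either way, the result is in line with the outperformance of about 4 % per annum quoted in the Introduction.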

Does this outperformance occur because of changes to Fama–French factor loadings or is it genuine skill (i.e. purely an increase in alpha without an increase to risk loadings)? Panel A of Fig. 3 graphs the alphas from the regressions using the unfiltered benchmark and the GICS filtered benchmark. It is clear that, with the GICS filtered benchmark, all funds outperform, although only eight have significant alpha. Panels B–D of Fig. 3 depict the market risk premium (RMRF), HML and SMB coefficients from the regressions. These figures indicate that there is very little difference in the risk loadings between the unfiltered and GICS filtered benchmarks. This suggests that SRI funds are not changing their exposure to the new GICS filtered SMB, HML and RMRF factors to compensate for eliminating SRI-prohibited stocks from their investment universe. Thus, when an appropriate asset-pricing model is used as a benchmark, such as our novel Fama–French model, there is evidence of pure outperformance (i.e. positive, statistically significant alpha).

Fig. 3 Unfiltered coefficients versus GICS filtered coefficients. These panels represent the unfiltered coefficients graphed against the GICS filtered coefficients for each of the 25 funds in sample 1. Panel A graphs the unfiltered and GICS filtered alpha, panel B graphs the unfiltered and GICS filtered RMRF, panel C graphs the unfiltered and GICS filtered HML, and panel D graphs the unfiltered and GICS filtered SMB

To confirm these visual results, we complete a statistical analysis of the difference between the unfiltered and GICS filtered coefficients for all SRI funds in data sample 1. A t test is performed to test for statistically significant differences in alpha and the RMRF, SMB and HML coefficients for the SRI funds when using the GICS Fama–French model versus the standard Fama–French model, as reported in Table 2. While the test results show that the difference in alpha is statistically significant at the 1 % level, importantly, the differences in the RMRF, SMB and HML coefficients are not statistically significant at any conventional level.Footnote 15 These results also hold when the differences between the unfiltered and SICS and NAICS filtered coefficients are tested. This supports the notion that the outperformance is due to genuine management skill and not differing exposure to risk factors. The nonparametric Wilcoxon ranked pairs test is also conducted as a robustness test. Its results, reported in Table 3, confirm the t test results. From this careful analysis, it appears that the statistically significant alpha for these eight funds indicates genuine skill (and not simply higher loadings on Fama–French size and value risk factors). This skill is detected because we are using appropriately filtered HML and SMB factors in our benchmark asset-pricing model.
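The two tests can be illustrated with a short sketch that computes the paired t statistic and the Wilcoxon signed-rank sums for per-fund coefficient differences. It assumes the t test is a paired test across funds, which the text does not state explicitly. The function names and the five pairs of alphas are hypothetical and are not taken from Tables 2 or 3; in practice a statistics library (for example `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon`) would also supply p-values.

```python
import math
import statistics


def paired_t_statistic(diffs):
    """t statistic for the null hypothesis that the mean paired difference is zero."""
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def signed_rank_sums(diffs):
    """Wilcoxon signed-rank sums (W+, W-), using average ranks for ties."""
    ordered = sorted((d for d in diffs if d != 0), key=abs)
    w_plus = w_minus = 0.0
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        for d in ordered[i:j]:
            if d > 0:
                w_plus += avg_rank
            else:
                w_minus += avg_rank
        i = j
    return w_plus, w_minus


# Hypothetical alphas for five funds under the two benchmarks.
unfiltered = [0.0010, 0.0005, -0.0002, 0.0012, 0.0008]
filtered = [0.0030, 0.0027, 0.0015, 0.0036, 0.0029]
diffs = [f - u for f, u in zip(filtered, unfiltered)]

t_stat = paired_t_statistic(diffs)
w_plus, w_minus = signed_rank_sums(diffs)
print(f"paired t = {t_stat:.2f}, W+ = {w_plus}, W- = {w_minus}")
```

Applying the same two functions to the differences in the RMRF, SMB and HML loadings gives the corresponding statistics for the risk exposures.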

Table 2 The results of t tests for unfiltered and filtered coefficients
Table 3 Wilcoxon ranked pairs test for unfiltered and filtered coefficients

We also report the regression results for data samples 2 and 3 in Tables 4 and 5. In Table 4, funds 26–32 are the additional funds that, combined with sample 1, create sample 2, and in Table 5, funds 33–66 are the additional funds that, combined with samples 1 and 2, create sample 3. The results show very similar patterns. In data sample 2, while no funds have statistically significant alpha using the standard Fama–French model, funds 26, 27 and 31 show evidence of skill when using the GICS, SICS and NAICS Fama–French models. Furthermore, the SMB and HML exposures stay the same in a fashion similar to data sample 1. Again, when regressions are run with SICS and NAICS filtered benchmarks in Panel B, the results are similar. Interestingly, for data sample 3, the results in Table 5 show that the unfiltered Fama–French model produces significant underperformance for six funds (funds 42–44, 46, 50 and 60), but that this underperformance disappears when the GICS filtered Fama–French model is used. This suggests that, at least with sample 3, SRI funds are not actually underperforming when an appropriate benchmark model is used. Of course, despite the larger sample size of data sample 3, limitations (discussed above) related to the more lax screening approach still exist. In summary, our results are robust to the specific choice of data sample (and hence screening approach), and similar conclusions can be drawn when restrictions are relaxed, resulting in larger sample sizes, and when different industry classification codes are used.

Table 4 Time series results using the alternative filtered benchmark for sample 2
Table 5 Time series results using the alternative filtered benchmark for sample 3

Overall, these results indicate that there appears to be genuine manager skill amongst some SRI funds. This has been detected by our novel approach of removing SRI-prohibited stocks from the Fama–French investment universe and then re-estimating SRI-specific HML and SMB risk factors. Furthermore, it appears that this outperformance is not due to a change in exposure to the RMRF, SMB and HML factors. That is, SRI funds are not gaming by changing their risk exposures to the Fama–French risk factors to generate higher returns. The results suggest that the increase in alpha is purely due to managerial skill.Footnote 16

Conclusion

The aim of this paper is to examine the performance of SRI mutual funds. SRI managers are unusual amongst fund managers in that they have strong, unambiguous investment constraints based on social goals. We focus here on the most prevalent type of screen, negative product screens that translate directly to the removal of ‘sin’ stocks from the SRI managers’ investment set.

Our research makes a number of contributions. Firstly, we estimate SRI manager skill from 1984 to 2006, representing one of the longest tests of skill in the SRI literature. Secondly, we use the GICS industry classification code, which, unlike the NAICS and SICS codes used in prior literature, accurately aligns SRI-prohibited industries with those eliminated from the SRI investment set. Most importantly, we eliminate ‘sin’ stocks from the benchmark Fama–French asset-pricing model used to estimate skill.

Previous research has provided inconclusive results about the existence of skill amongst SRI fund managers. We know that estimates of skill depend on the benchmark asset-pricing model used, and therefore if we do not correctly account for SRI investment constraints in our model, those estimates will be biased.

Rather than using screening intensity (Barnett and Salomon 2006) to account for the investment constraints under which SRI managers operate, we utilize a more accurate and elegant approach. That is, we eliminate SRI-prohibited stock and re-estimate the benchmark Fama–French asset-pricing model used to measure skill. We show that the effect of removing SRI-prohibited industry stocks from the SRI investment set leads to complex, systemic changes in the value and size risk factors driving SRI returns and that therefore new risk factors, loadings and ultimately a new SRI constrained Fama–French model must be estimated.

Once we re-estimate skill using this new SRI-constrained Fama–French model, we find that certain SRI funds demonstrate skill, unlike when using the standard model. These results are important for the SRI mutual fund industry. The impact of excluding the ‘sin’ stocks and their large returns appears to be very important when estimating skill. We show that once the investment constraints under which SRI fund managers operate are considered, evidence of SRI manager skill becomes apparent. Overall, these findings facilitate a better understanding of SRI fund performance and are thus an important contribution to the current body of SRI academic literature.