
Strategic behavior in regressions: an experimental study

Theory and Decision

Abstract

We study, in a laboratory experiment, a situation in which individuals must report their private information about a (dependent) variable to a public authority, which then infers the true values given a known (independent) variable using a regression technique. It is assumed that individuals prefer this predicted value to be as close as possible to their true value (single-peaked preferences). Consistent with the theoretical literature, we show that subjects misrepresent their private information more when an ordinary least squares (OLS) regression is implemented than when the so-called resistant line (RL) estimator is employed. The latter extends the median voter theorem to the two-dimensional setting and belongs to the family of robust estimation techniques. In fact, we find that OLS involves serious biases, whereas the RL estimation is empirically unbiased. Furthermore, subjects never earn less when the RL is applied.



Notes

  1. See Cox and Oaxaca (2008) for a survey on the topic, and Bardsley and Moffatt (2007) for an application to public goods.

  2. Formally, the direct revelation mechanism is manipulable if there is some individual \(i\) and some strategy profile played by the other individuals such that revealing the true preferences is not a best response for individual \(i\). The direct revelation mechanism is strategy-proof if and only if it is not manipulable by any individual; see Barberà (2001).

  3. We are very grateful to an anonymous referee for her/his helpful advice in this respect.

  4. Observe that the angle of a vector that points from an observation \((x_{i},\tilde{y}_{i})\) in the two-dimensional space \((x,y)\) to another observation \((x_{j}, \tilde{y}_{j})\) is simply the angle it forms with the vector pointing north, measured counter-clockwise. Recall that, since \(x_{j}>x_{i}\), the second observation is always to the right of the first one.

  5. We consider \(x_0=0\) and \(x_0=4\) in Experiments 1 and 2, respectively. Note that the range of \(x\) does not affect the data generating process, so this variation is innocuous.

  6. Figures 2, 3, 4, and 5 in the appendix represent the average estimates at the group level.

  7. Observe that the index \(g\) captures the cross-section data of the 24 groups that took part in our experiment (six groups per estimation procedure and informational condition), whilst \(i\) refers to every particular subject.

References

  • Barberà, S. (2001). An introduction to strategy-proof social choice functions. Social Choice and Welfare, 18, 619–653.

  • Bardsley, N., & Moffatt, P. G. (2007). The experimetrics of public goods: Inferring motivations from contributions. Theory and Decision, 62, 161–193.

  • Calsamiglia, C., Haeringer, G., & Klijn, F. (2010). Constrained school choice: An experimental study. American Economic Review, 100, 1860–1874.

  • Cason, T. N., Saijo, T., Sjöström, T., & Yamato, T. (2006). Secure implementation experiments: Do strategy-proof mechanisms really work? Games and Economic Behavior, 57, 206–235.

  • Chen, Y., & Sönmez, T. (2006). School choice: An experimental study. Journal of Economic Theory, 127, 202–231.

  • Cox, J. C., & Oaxaca, R. L. (2008). The use of market experiments to evaluate the performance of econometric estimators. In C. R. Plott & V. L. Smith (Eds.), Handbook of Experimental Economics Results. Amsterdam: North-Holland.

  • Fischbacher, U. (2007). z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics, 10, 171–178.

  • Johnstone, I., & Velleman, P. (1985). The resistant line and related regression methods. Journal of the American Statistical Association, 80, 1041–1054.

  • Lazear, E. (2000). Performance pay and productivity. American Economic Review, 90, 1346–1361.

  • Moulin, H. (1980). On strategy-proofness and single-peakedness. Public Choice, 35, 437–455.

  • Pais, J., & Pintér, A. (2008). School choice and information: An experimental study on matching mechanisms. Games and Economic Behavior, 64, 303–328.

  • Perote, J., & Perote-Peña, J. (2004). Strategy-proof estimators for simple regression. Mathematical Social Sciences, 47, 153–176.

  • Saporiti, A. (2009). Strategy-proofness and single-crossing. Theoretical Economics, 4, 127–163.

  • Sprumont, Y. (1991). The division problem with single-peaked preferences: A characterization of the uniform allocation rule. Econometrica, 59, 509–519.

  • Thomson, W. (1983). Problems of fair division and the egalitarian principle. Journal of Economic Theory, 31, 211–226.

  • Tukey, J. (1970). Exploratory data analysis (Limited Preliminary Edition). Reading, MA: Addison-Wesley.

Acknowledgments

J. Perote and J. Perote-Peña gratefully acknowledge financial support from the Junta de Castilla y León and the Spanish Ministry of Economics and Competitiveness through the projects SA218A11-1 and ECO2013-44483-P, respectively, and M. Vorsatz from the Fundación Ramón Areces and the Spanish Ministry of Economics and Competitiveness through the project ECO2012-31985. The helpful assistance of the LINEEX staff is also gratefully acknowledged.

Author information


Corresponding author

Correspondence to Javier Perote.

Appendices

Appendix: Estimated lines

Fig. 2

The average fitted regression line (in red) and the true underlying process (in black) in the OLS treatment for each of the six groups in Experiment 1. (color figure online)

Fig. 3

The average fitted regression line (in red) and the true underlying process (in black) in the RL treatment for each of the six groups in Experiment 1. (color figure online)

Fig. 4

The average fitted regression line (in blue) and the true underlying process (in black) in the OLS treatment for each of the six groups in Experiment 2. (color figure online)

Fig. 5

The average fitted regression line (in blue) and the true underlying process (in black) in the RL treatment for each of the six groups in Experiment 2. (color figure online)

Instructions Experiment 1 (translated from Spanish)

This experiment explores the design of an income tax system. You will be assigned to a group of 8 subjects that remains constant during the 48 rounds of the experiment. In every round, you may earn a quantity measured in ECU (experimental currency units) that will be converted into euros at the end of the experiment at the rate

$$\begin{aligned} 10 \, \text{ ECU } = 1 \, \text{ Euro }. \end{aligned}$$
  1. In every round, you will be assigned an income \(R\) and a contribution \(C\). The participants in your group have different incomes, taken from the following quantities: \(\{2,4,6,8,10,12,14,16\}\). The contribution of every group member depends on the income and will be randomly drawn from the following process:

    $$\begin{aligned} C = R + e, \end{aligned}$$

    where \(e\) is a normally distributed random variable with mean zero and variance four. This means that if your income is \(R\), then your contribution \(C\) will be in the interval \((R-4,R+4)\), although with a small probability of 5 % it may be outside this interval. Note that all participants from your group know the incomes but NOT the contributions of the other co-players; that is, every participant only knows her own contribution.

  2. The only decision you have to make each period is to report a contribution. The reported contribution can be any (rational) number between 0 and 24.

  3. Given the reported contributions of all group members, an estimation of the parameters of an income tax system will be computed: the intercept (lump-sum) and the slope (income percentage). This computation will be based on a simple rule that will be explained below in the section “estimation method”.

  4. Given the estimates for the intercept and the slope, an estimated contribution \(C^*\) will be computed for every subject in the following way:

    $$\begin{aligned} C^* = \mathrm{intercept} + \mathrm{slope} \times R. \end{aligned}$$
  5. Each round, the payoff you receive will be the maximum of zero and

    $$\begin{aligned} 5- |C-C^*|. \end{aligned}$$

    Consequently, your monetary earnings from the experiment will be higher the closer the estimated contribution is to your true contribution.
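The process in item 1 and the payoff rule in item 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the seed, and the example values passed to `payoff` are hypothetical and do not come from the experimental software. The sketch also checks the statement that a contribution lies in \((R-4,R+4)\) with a probability of roughly 95 %, since variance four means a standard deviation of two.

```python
import random
from statistics import NormalDist

INCOMES = [2, 4, 6, 8, 10, 12, 14, 16]  # incomes R listed in the instructions
SIGMA = 2.0  # variance four corresponds to a standard deviation of two


def draw_contributions(incomes, rng):
    """Draw one true contribution C = R + e per income, with e ~ N(0, 4)."""
    return [r + rng.gauss(0.0, SIGMA) for r in incomes]


def payoff(true_c, estimated_c):
    """Round payoff in ECU: the maximum of zero and 5 - |C - C*|."""
    return max(0.0, 5.0 - abs(true_c - estimated_c))


# Probability that C falls inside (R - 4, R + 4), i.e. within two standard deviations.
noise = NormalDist(mu=0.0, sigma=SIGMA)
prob_inside = noise.cdf(4.0) - noise.cdf(-4.0)

rng = random.Random(0)  # hypothetical seed, used only for reproducibility
contributions = draw_contributions(INCOMES, rng)

print(round(prob_inside, 4))
print(payoff(15.0, 13.5), payoff(15.0, 21.0))
```

The second call to `payoff` shows the floor at zero: an estimate more than 5 ECU away from the true contribution yields no earnings rather than a loss.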

Next, we display a figure as an illustration of those you will find throughout the experiment.

figure a

In the graph on the left-hand side of the figure, the red lines capture the bands where the contributions of the eight subjects of your group should be placed with 95 % probability. Your income (R) is 10, your contribution (the red point) is 15, and your reported contribution (the black point) is 16. The green line indicates the estimated contributions that are obtained from the reported contributions of all group members. In particular, the yellow point represents your own estimated contribution given the estimated income tax system.

On the right-hand side of the figure, you find the values of the main variables, which will be collected in a table as the experiment progresses. Every period, the table displays your income, your contribution (the red point), your reported contribution (the black point), your estimated contribution (the yellow point), the difference between your true and your estimated contribution (the blue line segment), and your payoff from the period.

1.1 Estimation method

In every period, the estimation of the income tax line \(C^* = \mathrm{intercept} + \mathrm{slope} \times R\) requires the estimation of both the “intercept” and the “slope” parameters given the known values of the income \(R\) and the reported contributions of the eight group members. Below, we show an example that explains graphically the procedure to obtain these estimates, assuming that the reported contributions are those shown in the next picture.

figure b

1.1.1 Specific part: treatment OLS

Given these observations, the estimated line (the green line in the figure below on the left-hand side) will be the one that minimizes the sum of the squared vertical distances (errors) between the reported contributions and those of the estimated line. Note that the sum of the errors above the estimated line (the blue lines) and the sum of the errors below it (the red lines) are exactly the same. In the example, it is assumed that the reported contributions are the truly assigned contributions. The estimated contributions are the values of the contributions for every income level on the estimated line (the green line). The final payoffs of the period are computed as 5 minus the distance between the true and the estimated contribution.
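The balance between the errors above and below the line follows from the first-order condition for the intercept. A short derivation, writing \(\tilde{C}_i\) for the reported contributions, \(\hat{a}\) and \(\hat{b}\) for the minimizing intercept and slope, and \(\hat{e}_i = \tilde{C}_i - \hat{a} - \hat{b} R_i\) for the resulting errors (notation assumed here for illustration, not taken from the instructions):

$$\begin{aligned} \left. \frac{\partial }{\partial a} \sum_{i=1}^{8} \left( \tilde{C}_i - a - b R_i \right)^2 \right| _{(\hat{a},\hat{b})} = -2 \sum_{i=1}^{8} \hat{e}_i = 0 \quad \Longrightarrow \quad \sum_{i:\, \hat{e}_i> 0} \hat{e}_i = \sum_{i:\, \hat{e}_i < 0} |\hat{e}_i|. \end{aligned}$$

The positive errors therefore add up to exactly the same amount as the absolute values of the negative ones.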

figure c

We now illustrate the impact of your reported contribution on payoffs. Starting with the numbers in the table above, what would have happened if the subject with income 16 and contribution 10 had reported a contribution of 2.3 (red point in the figure below)? You can observe the impact of such a decision on the estimated line, which would have changed from the “light green” to the “dark green” line in the plot below. The table on the right highlights the effects on the payoffs of all subjects in the group. The subject who changed her reported contribution would increase her payoff from 0.7 to 3.9.

figure d

Finally, what would have happened if the subject with income 6 had reported 15.8 instead of her true contribution, 10.8? The following figure and table illustrate this case. The subject with income 6 would increase her payoff from 1.8 (her payoff in the initial case where all subjects report their true contributions) to 2.7.
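The mechanics behind these two examples can be reproduced with a small sketch. The data below are hypothetical (the contributions shown in the figures are not reproduced in the text), so the payoffs differ from those in the tables; only the income of 16, the true contribution of 10, and the misreport of 2.3 are taken from the first example. The helper names are assumptions as well.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def payoff(true_c, estimated_c):
    """Round payoff in ECU: the maximum of zero and 5 - |C - C*|."""
    return max(0.0, 5.0 - abs(true_c - estimated_c))


incomes = [2, 4, 6, 8, 10, 12, 14, 16]
# Hypothetical true contributions; the last subject (income 16) has contribution 10.
true_c = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 10.0]

# Case 1: everybody reports truthfully.
a, b = ols_fit(incomes, true_c)
payoff_truthful = payoff(true_c[-1], a + b * 16)

# Case 2: the subject with income 16 reports 2.3 instead of 10.
reports = true_c[:-1] + [2.3]
a_mis, b_mis = ols_fit(incomes, reports)
payoff_misreport = payoff(true_c[-1], a_mis + b_mis * 16)

print(payoff_truthful, payoff_misreport)
```

With these hypothetical numbers, under-reporting pulls the fitted line down towards the subject's true contribution and raises her payoff from 1.5 to roughly 4.7, which is the kind of manipulation the examples describe.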

figure e

1.1.2 Specific part: treatment RL

Given these observations, the estimated line is obtained as follows.

  1. Take the subject with income 2 and trace the vectors that pass through her reported contribution and those of the subjects with incomes 12, 14 and 16. From these three lines, choose the median or central one (the red line).

    figure f
  2. Take the subject with income 4 and trace the vectors that pass through her reported contribution and those of the subjects with incomes 12, 14 and 16. From these three lines, choose the median or central one (the red line).

    figure g
  3. Take the subject with income 6 and trace the vectors that pass through her reported contribution and those of the subjects with incomes 12, 14 and 16. From these three lines, choose the median or central one (the red line).

    figure h
  4. Now take the three median vectors chosen in the last three steps (associated with the observations of the subjects with incomes 2, 4 and 6, respectively) and choose the median (central) one of these three vectors as the estimated line.

    figure i

Consequently, the estimated line always passes through two of the reported observations: those with the median reported contribution of the subjects with the three lowest (2, 4 and 6) and the three largest (12, 14 and 16) incomes. Furthermore, the observations of the subjects with incomes 8 and 10 are always discarded in the procedure.
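The four steps above can be sketched as follows. This is a minimal illustration, not the software used in the experiment: it assumes that the "median or central" line among three is the one with the middle slope, and the function names and reported contributions are hypothetical.

```python
def line_through(p, q):
    """Return (intercept, slope) of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return y1 - slope * x1, slope


def median_by_slope(lines):
    """Pick the middle line of an odd-sized list after sorting by slope (assumed ordering)."""
    return sorted(lines, key=lambda line: line[1])[len(lines) // 2]


def resistant_line(incomes, reports):
    """Apply steps 1 to 4 to the three lowest and the three highest incomes."""
    points = sorted(zip(incomes, reports))
    low, high = points[:3], points[-3:]
    # Steps 1-3: for each low-income subject, keep the median of her three lines.
    step_lines = [median_by_slope([line_through(p, q) for q in high]) for p in low]
    # Step 4: the estimated line is the median of those three median lines.
    return median_by_slope(step_lines)


incomes = [2, 4, 6, 8, 10, 12, 14, 16]
reports = [3.0, 3.5, 7.0, 8.5, 9.0, 13.0, 13.5, 12.1]  # hypothetical reports
intercept, slope = resistant_line(incomes, reports)
print(intercept, slope)
```

Because only the three lowest and the three highest incomes enter `resistant_line`, changing the reports at incomes 8 and 10 leaves the result untouched, in line with the remark that those observations are discarded.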

Given the estimated line in the figure above, the initial payoff of 5 ECU for every participant will be reduced by the vertical distance between the true contribution C and the one corresponding to the estimated line. Let us assume that all subjects reported their true assigned contributions except for the subject with income 16, who reported 10 instead of her true contribution of 12.1. In this case, payoffs are 5 ECU minus the vertical distances from the true values to the estimated ones (depicted in blue).

figure j

Note that if the subject with income 16 had reported her true contribution of 12.1, or any other value below 13.3, she would have obtained the same payoff, since the estimated line would not have changed (given the same values for all other subjects). However, if she had reported a contribution higher than 13.3 (her estimated contribution when all contributions are reported truthfully), for example 15, the estimated line and her payoff (and that of all other participants) would have changed. Finally, we present a figure and a table with the estimated line and the corresponding payoffs in this case.
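The same pattern can be checked numerically. The sketch below uses hypothetical contributions (so the threshold is not 13.3 as in the figure) and again assumes that the median line is the one with the middle slope. Only the income of 16, the true contribution of 12.1, and the under-report of 10 are taken from the example.

```python
def line_through(p, q):
    """Return (intercept, slope) of the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return y1 - slope * x1, slope


def median_by_slope(lines):
    """Middle line of an odd-sized list after sorting by slope (assumed ordering)."""
    return sorted(lines, key=lambda line: line[1])[len(lines) // 2]


def resistant_line(incomes, reports):
    """Median of the three median lines from the low incomes to the high incomes."""
    points = sorted(zip(incomes, reports))
    low, high = points[:3], points[-3:]
    return median_by_slope(
        [median_by_slope([line_through(p, q) for q in high]) for p in low]
    )


def payoff_of_last(incomes, reports, true_c):
    """Payoff of the subject with the highest income: max(0, 5 - |C - C*|)."""
    intercept, slope = resistant_line(incomes, reports)
    return max(0.0, 5.0 - abs(true_c - (intercept + slope * incomes[-1])))


incomes = [2, 4, 6, 8, 10, 12, 14, 16]
others = [3.0, 3.5, 7.0, 8.5, 9.0, 13.0, 13.5]  # hypothetical truthful reports
TRUE_C = 12.1  # true contribution of the subject with income 16

truthful = payoff_of_last(incomes, others + [TRUE_C], TRUE_C)
under = payoff_of_last(incomes, others + [10.0], TRUE_C)   # report below the line
over = payoff_of_last(incomes, others + [16.0], TRUE_C)    # report above the line

print(truthful, under, over)
```

In this hypothetical data set, the truthful line assigns the subject an estimated contribution of 15.25. Any report below that value leaves the line and her payoff unchanged, whereas the report of 16 moves the line further away from her true contribution and lowers her payoff.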

figure k

1.2 Final comments

Now you will be able to practice on your computer with similar examples during six different periods. The payoffs of these rounds will not affect your final payoffs. Once you finish these examples and after filling out a brief questionnaire, the experiment will start. Remember that the experiment lasts 48 periods and you will play all of them within the same group composition.

Instructions Experiment 2 (translated from Spanish)

This experiment consists of 48 rounds. You will be assigned to a group of 8 subjects that remains constant during the whole experiment. In every round, you may earn a quantity measured in ECU (experimental currency units) that will be converted into euros at the end of the experiment at the rate

$$\begin{aligned} 10 \, \text{ ECU } = 1 \, \text{ Euro }. \end{aligned}$$

Your final payoff will be equal to the sum of the payoffs from each round.

  1. In every round, you will be assigned two values \(R\) and \(C\). The possible values of \(R\) are 6, 8, 10, 12, 14, 16, 18, and 20 (in every round, each participant in your group will get a different one of those eight values, and you will receive each possible value six times over the course of the experiment). The value of \(C\) will be drawn randomly from the following process:

    $$\begin{aligned} C = R + e, \end{aligned}$$

    where \(e\) is a normally distributed random variable with mean zero and variance four. This means that your value \(C\) will be in the interval \((R-4,R+4)\), although with a small probability of 5 % it may be outside this interval. Each participant only knows her/his own value \(C\).

  2. The only decision you have to make each period is to report your value \(C\). The reported value can be any (rational) number between 0 and 26. So, the reported value does not necessarily have to coincide with the true value \(C\).

  3. Given the reported values of all group members, an estimation of two parameters \(b_1\) and \(b_2\) will be performed with the help of a simple rule that will be explained in the section Estimation method.

  4. Given the values \(b_1\) and \(b_2\), a value \(C^*\) will be computed for every subject with the help of the following function (line):

    $$\begin{aligned} C^* = b_1 + b_2 \cdot R. \end{aligned}$$
  5. Your payoff in each round is the maximum of zero (there are no negative payoffs) and

    $$\begin{aligned} 5- |C-C^*|. \end{aligned}$$

We display a figure as an illustration of those you will find throughout the experiment.

figure l

In the graph on the left hand side of the figure, the red lines capture the bands where the \(C\) values of the eight subjects of your group should be placed with 95 % probability. Your \(R\) value is 20, your \(C\) value (the red point) is 18.9, and your reported \(C\) (the black point) is 23. The green line indicates the estimated line that is obtained from the reported \(C\) values of all group members. In particular, the yellow point represents your own estimated value \(C^*\).

On the right-hand side of the figure, you find the values of the main variables, which will be collected in a table as the experiment progresses. Every period, the table displays your values \(R\) and \(C\), your reported \(C\) (the black point), your estimated \(C\) (the yellow point), the difference between your true and your estimated \(C\) (the blue line segment), and your payoff from the period.

1.1 Estimation method

The values \(b_1\) and \(b_2\) (that is, the estimated line) are calculated on the basis of the reported \(C\) values of the eight group members. The next figure presents an example of 8 reported values:

figure m

1.1.1 Specific part: treatment OLS

The graph on the left hand side of the next figure shows the estimated line given these observations. The estimated line is defined by the following property: the sum of the distances between the reported and the estimated \(C\) values that are above the estimated line is exactly equal to the sum of the distances between the reported and the estimated \(C\) values that are below the estimated line.
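The balance property stated above can be verified numerically for a least-squares line. The following sketch uses hypothetical reported values drawn around \(C = R\); the seed and the helper name `ols_fit` are assumptions made for illustration.

```python
import random


def ols_fit(xs, ys):
    """Return (b1, b2), the intercept and slope of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b2 = sxy / sxx
    return mean_y - b2 * mean_x, b2


r_values = [6, 8, 10, 12, 14, 16, 18, 20]
rng = random.Random(1)  # hypothetical seed, used only for reproducibility
reported = [r + rng.gauss(0.0, 2.0) for r in r_values]  # hypothetical reports

b1, b2 = ols_fit(r_values, reported)
errors = [c - (b1 + b2 * r) for r, c in zip(r_values, reported)]

# Total distance of the reports above the line and of the reports below it.
sum_above = sum(e for e in errors if e > 0)
sum_below = -sum(e for e in errors if e < 0)

print(abs(sum_above - sum_below) < 1e-9)
```

The two sums agree up to floating-point rounding for any set of reports, because a least-squares line with an intercept always leaves errors that add up to zero.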

figure n

The table on the right hand side of the figure shows the \(R\) and \(C\) values, the estimated \(C\) value (\(C^*\)), and the resulting payoff (which is the maximum of 0 and \(5-|C-C^*|\)). That is, the payoff is higher for a lower difference between the estimated and the true \(C\).

1.1.2 Specific part: treatment RL

The graph on the left hand side of the next figure shows the estimated line given these observations. The estimated line is defined by the following properties:

  1. It passes through one of the three reported \(C\) values corresponding to the \(R\) values of 6, 8, and 10. Additionally, one of the other two reported \(C\) values is above and the other below the estimated line.

  2. It passes through one of the three reported \(C\) values corresponding to the \(R\) values of 16, 18, and 20. Additionally, one of the other two reported \(C\) values is above and the other below the estimated line.

Observe that in the example, the estimated line passes through the reported \(C\) values corresponding to \(R=6\) and \(R=18\), leaving the reported values of \(R=6\) and \(R=16\) above and the reported values of \(R=10\) and \(R=20\) below the estimated line.

figure o

The table on the right hand side of the figure shows the \(R\) and \(C\) values, the estimated \(C\) value (\(C^*\)), and the resulting payoff (which is the maximum of 0 and \(5-|C-C^*|\)). That is, the payoff is higher for a lower difference between the estimated and the true \(C\).

1.2 Final comments

Now you will be able to practice on your computer with similar examples during six different periods. The payoffs of these rounds will not affect your final payoffs. Once you finish these examples and after filling out a brief questionnaire, the experiment will start. Remember that the experiment lasts 48 periods and you will play all of them within the same group composition.


Cite this article

Perote, J., Perote-Peña, J. & Vorsatz, M. Strategic behavior in regressions: an experimental study. Theory Decis 79, 517–546 (2015). https://doi.org/10.1007/s11238-014-9473-9


  • DOI: https://doi.org/10.1007/s11238-014-9473-9
