Experimental economics applied to cooperation and trust in online communities: a series of innovative experiments conducted at the CREM (Rennes 1)

by David Masclet, Laurent Denant-Boemont and Thierry Pénard

As part of this research project, researchers at the CREM (Rennes 1) carried out a series of experiments on the LABEX experimental platform to analyze the role of information, reputation, trust and incentives in repeated interactions within networks (online communities, intranets and extranets).

These experiments aimed to identify and characterize the technical, behavioral, psychological and institutional factors that can encourage or destabilize collective coordination in networks.

The success of auction websites and virtual marketplaces such as eBay or Amazon Marketplace, where millions of Internet users carry out commercial transactions, challenges standard economic reasoning. Indeed, the anonymity permitted by the Internet and the ease of entering or leaving these marketplaces are not, in principle, favourable to the development of trade. The possibility of changing identity easily and the physical distance between buyer and seller can create a climate of mistrust and encourage opportunistic behavior (non-payment, late delivery, or delivery of a product that does not match its description). Moreover, the partners cannot rely on the threat of future retaliation to ensure that their transactions go smoothly, since it is very uncertain that they will ever meet again (Dang Nguyen and Pénard, 2004).

How, then, can the success of virtual marketplaces such as eBay be explained? The same question arises for virtual communities in which anonymous Internet users help each other, provide information for free or contribute to the development of open-source software. How is trust created, and how are the risks of opportunism overcome, in a virtual community?

Three main classes of community can generally be distinguished: epistemic communities (open-source software projects), communities of practice (forums) and peer-to-peer sharing communities. These various online communities do not have the same capacity to stimulate cooperation and to limit free-riding (consuming without contributing). In epistemic communities, which produce or create knowledge (open-source software), access requires skills (high entry and exit costs), which reduces the risk of opportunism and makes peer pressure possible. Moreover, in this type of community it is often preferable that only the most qualified members contribute, rather than everyone. In communities of practice, which exchange advice, opinions and expertise (user forums), access is easier (and exclusion more difficult). In this type of community, it can be valuable to have a large number of contributors in order to obtain average opinions. However, contributions do not all have the same value, and it is very important that members with greater expertise or more refined taste contribute more intensively than the others (quantitative and qualitative aspects).

In sharing communities, which are often large and in which resources (files, content) are exchanged or shared, entry and exit costs are rather low. There, it is important that many members contribute in order to reduce the risk of congestion in access to resources.

Clearly, these types of community do not face the same risks of free-riding or opportunism. Their contribution needs differ (some communities need a small number of high-quality contributions, whereas others need many contributions, even modest ones). In addition, these communities do not have the same instruments or the same ability to generate cooperation and to overcome opportunism (differences in entry and exit costs, in the capacity to screen entrants and to exclude members, and in community size).

A first series of experiments focused on cooperation in an epistemic community. A second series focused on trust and reputation in a sharing community such as eBay. For each of these experimental studies, we present the context, the experimental protocol and the results.

Incentives and cooperation in an epistemic community

Aims of the study

The objective of this study, undertaken by Laurent Denant-Boemont and Thierry Pénard, researchers at the CREM (Rennes 1), was to better understand how a networked community whose purpose is to produce knowledge or services is managed: open-source software communities, corporate networks (intranets and extranets). The members of such a community are asked to provide services (expertise, software development, recommendations or ratings). Which incentive mechanisms emerge in this type of environment? The point of the study is to observe, in the laboratory, the genesis and management of an experimental community in different environments.

Design of the study

The experiments are based on a stylized model of strategic interactions developed by Dang Nguyen and Pénard (2001, 2005). In this community, individuals can issue requests addressed to the other community members, accompanied by various types of incentives. Processing a request produces a service. The members approached can choose to process the request or to decline, hoping that another member will process it. However, waiting can be collectively costly insofar as the value of the service decreases with time (delay cost). If the issuer of a request wants to maximize his profit or utility, which incentives should he offer to the rest of the community? The article explores two types of incentives: a monetary remuneration for processing the request, and more or less open (free) access to the service resulting from the processing. In the latter case, the issuer commits to the community to release all or part of the result: for example, he can grant free or partial access to certain features of a piece of software or to its source code.

The theoretical predictions of this community model are rather simple: the issuer's dominant strategy is to offer zero monetary remuneration and full access to the result of the processing. In addition, the issuer's optimal strategy depends neither on the value of the request nor on the delay cost incurred after a refusal. This model was then taken to the laboratory in a series of experiments, to see whether the subjects' choices matched the predicted strategies.

The experimental design is based on small communities (5 subjects) whose members are randomly assigned roles (issuer or receiver of requests). The receivers are heterogeneous with respect to their cost of processing a request. The procedure is iterative: the issuer makes a proposal, which is submitted to a randomly chosen first player. If he refuses, another player is randomly drawn to process the request, but the value of the product resulting from the processing is reduced, and so on. The period ends when one of the group members agrees to process the request (see the appendix for the detailed instructions of the experiment).
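The sequential procedure of one period can be sketched as follows. This is only an illustrative sketch: the text does not give the payoff functions, so the acceptance rule, the linear loss of value after each refusal, and the names `run_period`, `costs` and `access` are all assumptions made for the example, not the protocol actually used in the laboratory.

```python
import random


def run_period(value, delay_cost, remuneration, access, costs, rng):
    """Approach receivers in random order until one agrees to process the request.

    Assumptions (hypothetical, not taken from the study):
    - the value of the service falls by `delay_cost` after each refusal;
    - a receiver accepts when remuneration + access * current value
      covers his own processing cost.
    """
    order = list(range(len(costs)))
    rng.shuffle(order)  # receivers are approached in random order
    refusals = 0
    for receiver in order:
        current_value = max(value - refusals * delay_cost, 0)
        if remuneration + access * current_value >= costs[receiver]:
            return {"processor": receiver, "refusals": refusals, "value": current_value}
        refusals += 1
    # Nobody accepted: the text does not say what happens in this case.
    return {"processor": None, "refusals": refusals, "value": 0}


# Example: one issuer and four receivers with heterogeneous costs.
outcome = run_period(value=10, delay_cost=2, remuneration=0, access=1.0,
                     costs=[1, 2, 3, 4], rng=random.Random(0))
```

With full access and a high enough value, the first receiver approached accepts under this rule, so no delay cost is incurred.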

This game is implemented in four experimental treatments, which differ in the value of the processed request and in the delay cost. The players repeat the game fifteen times; in each period, roles are randomly reassigned within the groups formed at the beginning of the game.

Results

Figures 1 and 2 show the incentives chosen by the subjects to have their requests processed. Recall that each issuer first had to state the remuneration he would grant to the person who processed his request (from 0 to 4, in steps of 1). He also had to choose a degree of openness, or access, to the result of the processing: 0%, 25%, 50%, 75% or 100%. A level of 100% means full access (all group members benefit fully from the gain resulting from the processing) and a level of 0% means no access (only the issuer benefits if the request is processed). As regards monetary incentives, the predicted strategies are seldom followed. Monetary remuneration is zero in only one case out of ten. In 69% of cases, issuers choose a moderate but positive remuneration (1 or 2), and in only 4% of cases do they offer the maximum remuneration.

Figure 1. Distribution of the monetary remunerations chosen by the request issuers.

Concerning the degree of access to the result of the processing (Figure 2), the choices are more in line with the predictions. About half of the issuers choose full access, and 83% choose a degree of access of at least 50%. In addition, the choice of instruments (in particular monetary incentives) depends on the value of the processed request and also on the delay cost, which contradicts the theoretical predictions of the interaction model.

Figure 2. Distribution of the degree of access chosen by the request issuers.

Turning to the decisions of the individuals approached to process requests, the number of refusals is relatively low, with an average of one refusal per request. The level of cooperation is thus high in this experimental community, probably because of the strong incentives attached to the requests, but also because individuals are in turn issuers and receivers of requests, which fosters reciprocity and a sense of belonging to the community. In addition, the number of refusals decreases with the delay cost but is not affected by the value of the request.

Overall, this experimental study offers interesting insights into the management of online communities and the emergence of incentive mechanisms within these decentralized communities.

Trust and reputation in an information-sharing community

The second study, undertaken by David Masclet and Thierry Pénard, researchers at the CREM (Rennes 1), focuses on the management of information-sharing communities, through the case of eBay.

Context and state of the art

eBay users form an online community to which eBay provides a trading platform that includes, in particular, a rating system based on information sharing. This information concerns the conduct of the transactions carried out on the platform and the reliability of eBay users. Specifically, at the end of each transaction the system allows the buyer and the seller to give their partner a positive, neutral or negative rating and, optionally, to add comments. Each eBay participant is thus characterized by a rating profile, from which a score is calculated according to the following formula [1]: each positive rating counts +1, each neutral rating 0 and each negative rating -1. When a participant plans a transaction with another person, he thus has an idea of the partner's reliability. He can also consult the opinions of previous partners. Finally, he has information on the reputation of the raters themselves and can thus judge how much credibility to give to each rating. For example, he will not necessarily give the same credence to a negative rating issued by a person with a bad score as to one issued by a person with an excellent reputation.
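The scoring rule just described (+1, 0, -1) is simple enough to write down directly. The snippet below is a minimal sketch; the function name `feedback_score` and the string labels are illustrative choices, not eBay's actual interface.

```python
# Points attached to each rating type, as described in the text.
RATING_POINTS = {"positive": 1, "neutral": 0, "negative": -1}


def feedback_score(ratings):
    """Return the score of a rating profile: the sum of the points of all ratings."""
    return sum(RATING_POINTS[rating] for rating in ratings)


# A profile with two positive, one neutral and one negative rating.
profile = ["positive", "positive", "neutral", "negative"]
score = feedback_score(profile)
```

Note that two very different profiles can share the same score (for example one positive rating, or ten positive and nine negative ones), which is why the detailed rating history also matters to a prospective partner.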

A number of studies have sought to evaluate empirically the impact of the ratings and comments posted by buyers and sellers after an online transaction (Lucking-Reiley et al., 1999; Kalyanam and McIntyre, 2001; Melnik and Alm, 2002; Cabral and Hortacsu, 2005). For a synthesis of the studies on this subject, see Resnick, Zeckhauser, Swanson and Lockwood (2003). For example, Ghose, Ipeirotis and Sundararajan (2006) found that the margin earned by software retailers on Amazon Marketplace increased with their experience (number of transactions completed), their score and the positive comments they had received. Houser and Wooders (2001) studied auctions of Pentium III 500 processors on eBay during the autumn of 1999. They found that a 10% rise in the number of positive ratings received by a seller increases the price he obtains by 0.17%, whereas a 10% rise in the number of neutral or negative ratings decreases it by 0.24%. By contrast, the number of positive, negative or neutral ratings received by the buyer has no impact on the final price. Thus, only the seller's reputation seems to matter in transactions. This is explained by the fact that the risk of opportunism on eBay comes mainly from the seller, since the buyer has to pay before receiving the item.

A field experiment by Resnick, Zeckhauser, Swanson and Lockwood (2003) confirms the findings of Houser and Wooders. Similar lots of vintage postcards were put on sale under the identity of an experienced seller with a good reputation, then under the identity of an inexperienced seller. Buyers' willingness to pay was about 8% higher for the experienced seller. However, the price difference between inexperienced sellers with and without negative ratings is not significant. Resnick and Zeckhauser, for their part, obtained a more qualified result. They find no link between ratings and the selling prices of MP3 players. Nevertheless, they show that ratings affect the probability that the transaction takes place. For instance, a seller without any rating has a 72% chance of selling his item, whereas a seller with a rating score of 70 has a 96% chance of selling the same item.

With a rating system like eBay's, everyone thus has an incentive to be as honest as possible in order to receive positive ratings and build a good reputation, which secures better buying and selling prices in future transactions. Moreover, changing identity (returning to eBay under a new username) is costly once one has a good reputation, which further reduces the incentive to behave opportunistically. Yet the effectiveness of this rating system hinges on high participation by buyers and sellers. It can be tempting to leave it to others to provide ratings, since evaluating takes time and effort. This free-riding behavior, if it spreads, can harm a virtual marketplace like eBay. It is the classic problem of contributing to a public good, or more precisely to a collective good available to all users of the marketplace, without exclusion or rivalry.

Several studies have highlighted this under-contribution to the eBay rating system. Resnick and Zeckhauser (2002) analyzed in detail all the transactions that took place on eBay between February and June 1999, together with the rating history linked to these transactions. According to their data, 50% of transactions were rated by buyers and 60% by sellers. Among buyers' ratings, only 0.6% were negative and 0.3% neutral; among sellers' ratings, 1.6% were negative and 0.3% neutral. Looking at the recipients of these bad ratings, sellers are all the more likely to receive a bad rating when they are inexperienced (and the same holds for buyers). In addition, Resnick and Zeckhauser underline the predictive power of ratings for the quality of current transactions. A transaction is more likely to involve a problem (a neutral or negative rating) if the seller has received a negative rating in the past. Finally, the authors observe reciprocity in ratings. For instance, when the buyer's rating is positive, the probability that the seller responds with a positive rating is high. Conversely, after a negative rating, the probability of responding with a negative or neutral rating is high. Buyers and sellers probably have an incentive to send positive ratings, in the hope that the other party returns a positive rating that improves their own profile. Likewise, some participants may prefer not to post a "justified" negative rating for fear of receiving an "unjustified" negative rating in return. This fear of retaliation could have perverse effects. Some participants with a high reputation could behave opportunistically towards participants with a weak reputation without risking negative ratings (by exploiting the threat of retaliation).

Dellarocas, Fan and Wood (2004), for their part, observed 51,000 auctions of collectible coins on eBay between April and September 2002. They note that 77% of sellers and 67% of buyers left a rating, positive in the majority of cases. From these data, the authors sought to identify what motivates eBay users to contribute to this community reputation system. They identify three main motivations:

 Pure altruism: I evaluate my partners because I know that it is good for the whole community.

 Impure altruism, or reciprocity: I evaluate my partners because they evaluated me, and so I speak favourably of them.

 Selfishness: I evaluate my partners only when I want to be taken for an altruist and prompt them to evaluate me.

In this last case, rating is strategic: the individual decides to evaluate as soon as the transaction ends if he thinks he is likely to face a reciprocating partner (one who will give a positive rating if he receives one). In this way, the selfish user can increase his score and thus his reputation. To isolate the rating motivations in each coin transaction, Dellarocas et al. (2004) checked, several days after the end of the auction, whether the buyer and the seller had left ratings. From this information, they compute the probability of evaluating one's partner, conditional on having been evaluated by the partner or not. Under pure altruism, being evaluated or not should have no effect on the probability of evaluating, whereas under impure altruism, being evaluated should increase the probability of evaluating (positive effect). Finally, under selfishness, having already been evaluated should lead the user not to evaluate his partner (negative effect), whereas not having been evaluated could encourage him to evaluate, if he thinks he has a real chance of facing a reciprocator. The authors find that the eBay community is in fact made up of altruists, reciprocators and selfish users alike. Contributions to eBay's reputation mechanism thus stem from various motivations, ranging from pure altruism to selfishness.

Aims of the study

The objective of our experiments is precisely to identify the motivations for contributing to a rating system similar to eBay's and to understand how such a system can create trust. To do so, we used an experimental approach based on the trust game. This game is a good approximation of a transaction on eBay. Indeed, on eBay, one of the trading partners (the buyer) must send a payment to the other (the seller), hoping to receive the item in return. The buyer thus has to trust the seller. The analogy with the trust game is clear: two players receive an endowment, and the first player chooses how much of his endowment to send to the second. The second player receives a multiple of the amount sent (generally three times) and must decide how much to send back to the first player. The Nash equilibrium of this game is trivial: the second player's best choice is to keep everything, so the first player should send nothing, and each player ends with a final payoff equal to his initial endowment. However, this outcome is collectively sub-optimal, since the first player could have increased the total payoff of the two players by sending his whole endowment. Berg, Dickhaut and McCabe (1995) observed in their experiments that players' choices were far from the Nash equilibrium: subjects send on average 50% of their endowment and get back on average one third of the amount received by their partner.

Compared with the standard trust game, a second stage was added after the players' investment and return decisions, in which they can give their partner a positive or negative rating. Three rating systems were tested: simultaneous rating, sequential rating (where one of the players is chosen to evaluate first) and a two-period rating system (each player can rate immediately or wait). The point of these three treatments, to which a baseline treatment (a game with no ratings) is added, is to better understand the motivations for contributing to a community rating system, by distinguishing altruistic, selfish and reciprocal behavior. Which rating system generates the highest trust and thus the highest investments? Do subjects evaluate more when they have to do so simultaneously or sequentially? Do they prefer to wait, or to evaluate as soon as possible to prompt the others to do the same? The experiments conducted at the LABEX (University of Rennes 1) provide some answers to these questions.

Few experimental studies exist on these questions, with the exception of Keser (2002, 2003), who carried out a series of experiments with features in common with ours. That experiment considers a sequential trust game, followed by a second phase in which only the first player, called A (who must send part of his endowment), can rate his partner, called B, positively or negatively at no cost, after observing the amount sent back by B.

The experiment runs over twenty periods, with each player A meeting a different player B in each period. Keser considers two variants of the rating system. In the first, player A is informed of the rating received in the previous period by the player B with whom he is about to interact (partial knowledge of the past). In the second, he is informed of all the ratings B received in the past (full knowledge of the past). Keser observes more investment, and thus more trust, from A players with a reputation system than without one, and higher returns from B players with a reputation system than without one (in absolute terms and as a percentage of the amount received). However, there is no significant difference in investments or returns between a short-term reputation system (partial memory) and a long-term one (complete memory).

Design of the experiment

The experiments were conducted in November 2005 and January 2006, with more than 240 students, at the LABEX (laboratory for experiments in social sciences at the University of Rennes 1). See the appendix for the detailed instructions. We first ran a simultaneous trust game without any rating stage (WITHOUT treatment). Communities of 10 subjects were formed, with 5 A players (investors) and 5 B players (who receive the investment). The game comprised 20 rounds, and each subject kept the same role throughout the experiment. At the beginning of each round, each player A is paired with a player B (the pairing changes from one round to the next). Both players receive an initial endowment of 10. Simultaneously, player A decides to send an amount between 0 and 10 to B, and player B chooses, for each amount he may receive from player A, a return amount between 0 and what he would receive (the amount sent by A multiplied by 3). In each round, player A's profit equals 10 - amount sent + amount returned, and player B's profit equals 10 + 3 × amount sent - amount returned. At the end of the experiment, profits were summed and converted at the rate of 1 euro for 30 points, to which a fixed participation fee of 3 euros was added.
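The payoff rules of one round can be written as a short sketch. The numbers (endowment of 10, multiplier of 3, 30 points per euro, 3 euros fixed payment) come from the description above; the function names and the representation of B's decision as a dictionary `return_plan` (one return amount for each amount A might send) are assumptions made for the illustration.

```python
ENDOWMENT = 10           # initial fund of each player
MULTIPLIER = 3           # the amount sent by A is tripled before reaching B
POINTS_PER_EURO = 30     # conversion rate at the end of the experiment
PARTICIPATION_FEE = 3    # fixed payment in euros


def round_payoffs(sent, return_plan):
    """Profits (A, B) for one round of the simultaneous trust game.

    `return_plan` maps every amount A may send to the amount B returns,
    because B decides without knowing what A actually sent.
    """
    if not 0 <= sent <= ENDOWMENT:
        raise ValueError("A must send between 0 and the endowment")
    received = MULTIPLIER * sent
    returned = return_plan[sent]
    if not 0 <= returned <= received:
        raise ValueError("B cannot return more than he received")
    profit_a = ENDOWMENT - sent + returned
    profit_b = ENDOWMENT + received - returned
    return profit_a, profit_b


def earnings_in_euros(total_points):
    """Convert the points accumulated over all rounds into euros."""
    return total_points / POINTS_PER_EURO + PARTICIPATION_FEE


# Example: B plans to return exactly what A sent (one third of what he receives).
plan = {amount: amount for amount in range(ENDOWMENT + 1)}
profit_a, profit_b = round_payoffs(5, plan)
```

In this example A ends the round with 10 points and B with 20, whereas the Nash outcome (A sends 0) leaves both with 10.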

In the three other treatments (SIM, SEQ_ENDO, SEQ_EXO), the trust game is identical to the previous one, but a rating stage is added once the amounts sent and returned are known. A player who decides to evaluate can give either a negative rating point (-1) or a positive rating point (+1). In both cases he is charged 1, whereas the recipient of the rating pays nothing. The rating point sent to a player is recorded in his rating grid, which follows him throughout the experiment. This grid includes the history of rating points as well as a score, which is the cumulative sum of the positive and negative points obtained in previous rounds. In each new round, the grid is shown to the participant with whom the player is paired. Thus, at the beginning of each round, each subject knows his partner's rating grid and can form an idea of his reputation.
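The rating grid described above can be sketched as a small data structure. The class and function names (`RatingGrid`, `rate`) are hypothetical; only the rules (points of +1 or -1, a cost of 1 borne by the rater, a score equal to the cumulative sum of points) are taken from the text.

```python
from dataclasses import dataclass, field

RATING_COST = 1  # charged to the player who rates, never to the recipient


@dataclass
class RatingGrid:
    """History of the rating points received by one player."""
    history: list = field(default_factory=list)

    def record(self, point):
        if point not in (-1, 1):
            raise ValueError("a rating point is either -1 or +1")
        self.history.append(point)

    @property
    def score(self):
        # Cumulative sum of the positive and negative points received so far.
        return sum(self.history)


def rate(rater_profit, partner_grid, point):
    """Record a rating in the partner's grid and return the rater's profit after the cost."""
    partner_grid.record(point)
    return rater_profit - RATING_COST


# Example: a player receives +1, -1 and +1 over three rounds.
grid = RatingGrid()
profit = 12
for point in (1, -1, 1):
    profit = rate(profit, grid, point)
```

Because rating is costly for the rater and free for the recipient, a purely payoff-maximizing player would never rate in the last round, which is what makes observed ratings informative about motivations.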

As for the timing of ratings in the three treatments: in the SIM treatment, both players are asked to rate at the same time, whereas in the SEQ_EXO treatment the order in which partners evaluate each other is determined by a random draw. During a SEQ_EXO session, a subject can thus alternate between rounds in which he rates first and rounds in which he rates second. Finally, in the SEQ_ENDO treatment, each subject chooses whether to evaluate first or to wait, knowing that he can evaluate only once per round. Our experiments differ from those of Keser (2002, 2003) in several respects, since Keser considers a sequential trust game in which only player A can evaluate, and at no cost. The drawback of Keser's design is that, when player B returns little money, it is difficult to distinguish opportunistic behavior from retaliation. From this point of view, our game better isolates what stems from opportunism (since the sending and return decisions are simultaneous) from what stems from sanction (a negative rating). Moreover, our four treatments allow a fine-grained analysis of the motives for rating and a comparison of the effectiveness of the rating systems.

Results

Table 1 presents the average investment of A and the average amount returned by B in the various experimental treatments. Being able to evaluate one's partner does create trust: A's investment is on average lowest when there is no rating system (2.24). The highest investments occur when subjects can evaluate each other simultaneously (4.36) or sequentially in a predefined order (4.17). The two-period rating system lies in between, above the no-rating baseline but below the two other rating systems (average investment of 3.32). It thus seems that the SEQ_ENDO treatment, by enabling more sophisticated rating strategies (waiting in order to respond to the partner's rating, or rating first to induce the partner to rate back), reduces the players' trust compared with the two other rating systems (negative externalities of the ratings).

| | WITHOUT | SIM | SEQ_EXO | SEQ_ENDO |
| Investment of player A | 2.2 (2.9) | 4.4 (3.7) | 4.2 (3.5) | 3.3 (3.2) |
| Amount returned by player B | 1.4 (3.2) | 4.2 (6.3) | 4.0 (5.3) | 3.0 (4.8) |
| Return rate (% of amount received) | 12% | 23% | 22% | 19% |

Table 1. Average levels of investment and return by experimental treatment (standard deviations in parentheses).

Player B's returns are also significantly higher with rating than without. B sends back on average 1.45 units in the WITHOUT treatment, compared with 4.21 in the SIM treatment and 3.98 in the SEQ_EXO treatment. The possibility of evaluating over two periods (SEQ_ENDO) leads to smaller returns (3) than simultaneous or sequential rating, but they remain significantly higher than the average return with no rating system. In relative terms, B subjects send back only 11.8% of the amount received when there is no rating system (which means that A receives less than he invested), whereas with ratings the return rates range between 19.1% and 22.8% (still below the 33% threshold at which A recovers the amount invested).
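The 33% threshold follows directly from the multiplier of 3: if A sends s, B receives 3s, so A breaks even only when B returns at least one third of what he received. A minimal check, in which the function name `investor_net_gain` and the amount of 6 are arbitrary illustrative choices and average return rates are applied to a single transaction for simplicity:

```python
def investor_net_gain(sent, return_rate, multiplier=3):
    """Net gain of player A: amount returned by B minus amount sent.

    `return_rate` is the share of the received amount (multiplier * sent)
    that B sends back.
    """
    return return_rate * multiplier * sent - sent


# Return rates reported in the text, applied to an investment of 6 units.
gain_without_rating = investor_net_gain(6, 0.118)   # negative: A loses money
gain_best_rating = investor_net_gain(6, 0.228)      # still negative
gain_at_threshold = investor_net_gain(6, 1 / 3)     # approximately zero
```

Even in the most favourable treatment, the average return rate therefore leaves the investor with a loss on the amount sent.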

Figures 3 and 4 show, respectively, A's investment and the amount sent back by B over all periods, by treatment. A common feature is the decline over time in A's average investment and B's average return.

Figure 3. Average level of investment of player A by treatment over time.

Figure 4. Average amount returned by player B by treatment over time.

Concerning the decision to evaluate, a subject is all the more inclined to rate his partner negatively (or positively) the lower (or higher) the partner's investment or return. Moreover, when he evaluates, participant A mainly gives negative points, whereas participant B mainly gives positive points. Finally, subjects evaluate more under a sequential rating system than under a simultaneous one: participant A evaluates in 22% of cases in the SIM treatment, and in 24% and 27% of cases in the SEQ_EXO and SEQ_ENDO treatments respectively. These results suggest that part of the ratings in the sequential treatments are decided according to the partner's investment or return, but also in reaction to the rating (or absence of rating) given by the partner. For instance, under sequential rating, an individual who received a negative point in the first stage returns a negative point in 72% of cases, while a subject who received a positive point in the first stage returns a positive point in the second stage in 63% of cases. Moreover, more positive than negative ratings are given in the first stage, suggesting that some players try to induce their partner to reciprocate.

Overall, these experiments provide a better understanding of the role played by trust and reputation in an online community, and thus suggest ways of making cooperation in these communities more effective.