Final Project: Modeling Mechanical Turk with Incentive Structures for Performance of Human-Intelligence Tasks

Frank Chen, Human-Computer Interaction Group, Stanford University

In recent years, there has been an explosion in the use of crowdsourcing platforms to solve real problems. This approach is commonly called artificial artificial intelligence, and harnessing the power of crowds begins with a question: how can problems be decomposed so that crowds can be incentivized to solve them? The largest of these platforms is Amazon’s Mechanical Turk (AMT).

Essentially, AMT provides an inexpensive, scalable research platform that mediates many of the logistical issues of data collection and participant recruitment. Researchers in human-computer interaction, psychology, and political science have used this platform to perform large-scale information look-up, work, and human-computation tasks. Furthermore, many canonical experiments appropriate for the context of AMT have been recreated and demonstrated on this platform.

Agent-based modeling (ABM) frames computational problems using situated agents in an ecosystem [1]. One can imagine AMT as a particular instance of an ecosystem in which each Turker is an agent. The analogy is apt because of the several constraints placed on the knowledge of agents. ABM provides a framework to imagine each Turker as intelligent, purposeful, and situated in time and space. We can then use ABM to observe emergent patterns, decomposing the opaque AMT ecosystem to observe behaviors and the completion of activities, using the literature as a basis for constructing an agent-based model.
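As a minimal illustration of this framing, a Turker can be written as an agent that chooses among posted HITs and completes them with skill-dependent success. This is a hypothetical Python sketch, not the paper's model: the reward-per-effort choice rule and the `skill` parameter are assumptions made for illustration.

```python
import random

class Turker:
    """A hypothetical Turker agent that picks the HIT with the best
    reward-to-effort ratio and completes it with skill-dependent success."""

    def __init__(self, skill):
        self.skill = skill      # probability of completing a task acceptably
        self.earnings = 0.0

    def choose_hit(self, hits):
        # hits is a list of (reward, effort) pairs; pick the best payoff per unit effort
        return max(hits, key=lambda h: h[0] / h[1])

    def work(self, hit):
        reward, _effort = hit
        if random.random() < self.skill:
            self.earnings += reward

# a tiny market of HITs: (reward in dollars, effort in minutes)
hits = [(0.05, 1.0), (0.25, 4.0), (0.10, 1.5)]
turkers = [Turker(skill=random.uniform(0.5, 1.0)) for _ in range(100)]
for t in turkers:
    t.work(t.choose_hit(hits))
```

Even this toy rule makes incentive structure an emergent driver of which HITs get completed, which is the kind of question the full model explores.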

Throughout this paper, we present the AMT literature on incentives, data quality, and efficiency. We then describe how we create an ABM of AMT, report results from this model, and discuss the insights gained from observing the phenomena at agent-level fidelity. We conclude with critiques and next steps for this model.


Posted in Final projects, Uncategorized

Understanding the Dynamics of School Progression

An overview of the complete model

This research is grounded in the observation that grade progression in low-income South African schools is only weakly linked to actual ability and learning, as a consequence of the inability of (poorly trained, low-quality) teachers in these schools to appropriately assess the competency of their learners. The primary purpose of this model is therefore to understand with greater granularity the relationship between student progression and teacher and social factors. In particular, how do teacher efficacy in assessing student ability, opportunity to learn, levels of unemployment, and pay differentials between skilled and unskilled workers affect students’ choices about school attendance? A key contribution of this model is to explore the relative importance of these factors along two dimensions: grade repetition and student dropouts. Whilst grade repetition is a function exclusively of ability and teacher assessment efficacy, this outcome feeds into dropout rates, creating a dynamic relationship between these two variables.

The model consists of four significant sequential procedures, with each tick representing a single school year. First, student agents are created with an innate ability level (once the model is running, this represents only the addition of a new cohort of Grade 8s); next, students write an annual exam that is assessed with some randomness; they then decide whether or not to remain in school on the basis of this test, their innate ability, and various labour-market factors. Finally, those students who passed the test and have not dropped out are promoted to the next grade.
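The yearly sequence can be sketched as a loop over the student population. This is a hedged Python paraphrase of the procedures described above, not the model itself: the pass mark, noise level, and `dropout_prob` parameter are illustrative assumptions (in the actual model, the dropout decision also depends on labour-market factors).

```python
import random

PASS_MARK = 50
MEAN_SCORE = 62

def run_year(students, noise_sd, dropout_prob):
    """One tick: assess each student, let failing students consider dropping
    out, and promote those who passed and stayed."""
    remaining = []
    for s in students:
        # exam score = innate ability plus a stochastic assessment error
        score = s["ability"] + random.gauss(0, noise_sd)
        passed = score >= PASS_MARK
        # failing students may leave school, depending on labour-market prospects
        if not passed and random.random() < dropout_prob:
            continue  # dropout
        if passed:
            s["grade"] += 1
        remaining.append(s)
    return remaining

# a new Grade 8 cohort each year, abilities drawn around the model mean
cohort = [{"ability": random.gauss(MEAN_SCORE, 10), "grade": 8} for _ in range(200)]
cohort = run_year(cohort, noise_sd=15, dropout_prob=0.3)
```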

A higher stochastic component results in increased dropout rates and fewer students reaching the higher grades

Whilst the average student’s score remains approximately at the mean specified in the model (62%), the stochastic component in assessment produces far greater variation of test scores around the mean, and therefore a higher probability of students failing. This in turn generates a protracted schooling career. Schooling duration increases as employment prospects worsen (dropouts decrease). Increasingly small class sizes emerge at higher grade levels as the extent of randomness in test-mark assignment increases. Notably, an inverted U appears in total enrolment under the high and average unemployment scenarios. Under favourable labour market conditions (low unemployment), students have little incentive to remain in school following a poor assessment of their ability. However, as labour market conditions worsen, two conflicting forces come into play: as randomness in assessment increases, students are more likely to drop out (therefore we observe smaller classes), but a higher stochastic component also means higher failure rates (and therefore larger class sizes at each grade level). The results suggest that in the face of high unemployment, the latter prevails, and remaining in school becomes a more successful strategy for students.
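The statistical mechanism behind this result can be shown in a few lines: adding zero-mean noise to a fixed ability leaves the average score unchanged but raises the probability of falling below the pass mark. The pass mark and noise levels here are illustrative assumptions, not the model's calibrated parameters.

```python
import random

def failure_rate(mean_ability, noise_sd, pass_mark=50, n=20_000):
    # fraction of assessments landing below the pass mark when zero-mean
    # normal noise is added to a fixed underlying ability
    fails = sum(1 for _ in range(n)
                if mean_ability + random.gauss(0, noise_sd) < pass_mark)
    return fails / n

# the mean score stays at 62 in both cases, but failure becomes far more likely
low = failure_rate(62, noise_sd=5)
high = failure_rate(62, noise_sd=20)
```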


Posted in Final projects, Uncategorized

Participation in a Monolingual Discussion by Native Speakers and Non-Native Speakers

This model simulates participation in a monolingual discussion by native speakers and non-native speakers by operationalizing the assumption that a non-native speaker experiences language anxiety when, on average, the rest of the group is more proficient in the discussion language than him-/herself. Language anxiety lowers the probability of non-native speakers participating in discussions.

The three basic variables that can be manipulated are i) group size, ii) the percentage of non-native speakers, and iii) the mean language proficiency of a normal distribution with standard deviation 0.25, from which a value is drawn for each non-native speaker (relative to 1.00 for native speakers).
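The anxiety assumption can be sketched in Python as follows. This is a minimal paraphrase of the mechanism described above: the base turn-taking probability and the linear form of the anxiety penalty are assumptions made for illustration, not the model's actual functional form.

```python
import random

def make_group(size, pct_nns, nns_mean, sd=0.25):
    # proficiency is 1.00 for native speakers; for non-native speakers it is
    # drawn from a normal distribution, clipped to [0, 1]
    n_nns = round(size * pct_nns)
    natives = [1.0] * (size - n_nns)
    nns = [min(1.0, max(0.0, random.gauss(nns_mean, sd))) for _ in range(n_nns)]
    return natives + nns

def participation_prob(own, others_mean, base=0.5):
    # anxiety arises only when the rest of the group is, on average, more
    # proficient than the speaker; it lowers the probability of taking a turn
    anxiety = max(0.0, others_mean - own)
    return base * (1.0 - anxiety)

group = make_group(size=10, pct_nns=0.4, nns_mean=0.5)
```

Under this rule a native speaker never experiences anxiety, while a non-native speaker's participation probability falls as the gap to the rest of the group widens, which is enough to generate the less-than-proportionate turn-taking reported below.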

The model results show:

1) A moderately less-than-proportionate number of speaking turns taken by non-native speakers;

2) A higher incidence of very quiet speakers among non-native speakers in small to mid-sized groups, relative to among native speakers;

3) The highest incidence of very quiet non-native speakers when they are the minority;

4) more even participation if all speakers attempt to meet a minimum number of speaking turns (pick a value greater than 0 on the “min-turn-req” slider).

Video: Simulation

Proportion of very quiet speakers in each breed, with mean language proficiency of 0.50 among non-native speakers, and by group size

(*NNS-NS difference at 100% is NNS at 100% minus NS at 100%.)

Posted in Final projects

Final Project: The Impact of Social Proximity on Individual Technology Adoption

Technology adoption model after it has been run.

As innovations, particularly those in the high-tech sphere, emerge faster than ever, we constantly feel pressure to adopt one or another. We use what we know and what we have already embraced, and whether we venture to try something new varies from individual to individual. Indeed, within one individual, whether and when we adopt one innovation can greatly vary from whether and when we adopt another. And with so many sources of information and opinion available on so many innovations, we individually determine which sources we turn to and which sources influence us. Friends, family, neighbors, coworkers, acquaintances: our trust in them places them within our personal sphere of influencers. Simply put, when it comes to technology adoption, we may not be as much creatures of habit as we are creatures of our personal opinions combined with our social positions.

I developed a model to explore technology adoption patterns. My goal in creating it was to represent how people overcome the tendency to hold on to old technology and try new technological tools and methods. In particular, I worked to isolate the impact of personal attitudes toward a technology and of social proximity. Regarding social proximity, I sought to explore the effect on individual adoption behavior of having stronger versus weaker ties, and more versus fewer ties, in our individual networks. Regarding personal attitudes toward a technology, my goal was to differentiate individuals as different types of adopters generally, and even as different types of adopters with regard to specific technologies. One person’s attitude toward adopting Twitter will differ from another’s. And our individual attitudes toward adopting one technology can easily vary from our attitudes toward adopting another. The cell phone is not Twitter, and Twitter is not the Segway. One may have been quick to adopt the cell phone, in the mainstream when adopting Twitter, and probably never adopted the Segway at all. In building my model, I sought to illustrate that when it comes to technology adoption, individuals differ in several ways. While the model may seem generic (not crafted for any one particular type of person or technology), it is actually quite flexible in this regard.

The model is structured so that it focuses on one individual adopter, who is represented by a large white node in the center. The individual adopter has a number of close friends (local nodes) who are represented by smaller green nodes and are connected to the center node by links, or strong ties. They are located relatively close to the center node. The individual adopter has a number of acquaintances (extended nodes) who are represented by smaller red nodes and are connected to the center node by links, or weak ties. They are located relatively far from the center node. The user adjusts the individual’s adopter type, the number of friends, and the number of acquaintances with the sliders to the left. Metrics for the time passed (in ticks) and the number of friend adopters, acquaintance adopters, and total adopters are tracked in the bottom left as the model runs. The total number of adopters is plotted in the top plot on the right; the individual adopter’s progress towards adopting the technology is plotted in the second plot. As agents adopt the technology, they become yellow nodes.
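The central adopter's decision can be sketched as a weighted-threshold rule: strong ties contribute more social pressure than weak ties, and the adopter type sets how much pressure is needed before adopting. This is a hypothetical Python sketch; the specific weights and the threshold form are assumptions for illustration, not the model's actual update rule.

```python
def adoption_pressure(n_friend_adopters, n_acq_adopters,
                      strong_w=1.0, weak_w=0.4):
    # strong ties (close friends) weigh more heavily than weak ties (acquaintances)
    return strong_w * n_friend_adopters + weak_w * n_acq_adopters

def decides_to_adopt(pressure, adopter_type_threshold):
    # a lower threshold corresponds to an earlier adopter type
    return pressure >= adopter_type_threshold

# e.g. 3 adopting friends and 5 adopting acquaintances against a threshold of 4
p = adoption_pressure(3, 5)
adopted = decides_to_adopt(p, adopter_type_threshold=4)
```

Varying the tie weights against the counts of friends and acquaintances is what lets the model separate the effect of tie strength from the effect of network size.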


Posted in Final projects

Final Project – Petri, a Simulation of Bacterial Growth

“Petri” is an agent-based model of bacterial growth built in NetLogo. Bacteria colony growth is a good candidate for agent-based modeling because bacteria are relatively homogenous and their individual dynamics are relatively well-understood. Microbiologists studying the growth of multicellular organisms and colonies may want to know the reason for a particular colony-level phenomenon, and agent-based models allow them to test whether it is emergent from individual cell properties. In contrast, system dynamics models of population growth can only describe the macro-level changes in population over time.

Using photographs of real bacteria in a Petri dish, I extracted the pattern of the bacteria after five days of growth into NetLogo with the import-pcolors feature. I also obtained photographs of the same bacteria after their growth in the dish stagnated several days later. My goal was to find the parameters that lead to the most accurate growth pattern, and to judge the real-life plausibility of the most successful parameters.

The simulation environment is divided up into squares called patches, and time moves forward in discrete time steps called ticks. Each patch can be empty, part of a wall, or occupied with a living or dead cell. Each patch also “owns” a variable for its age, if it is alive, and for the amount of food that it carries. The sum of these variables constitutes the state of a patch. At each tick of time, each living patch runs through a set of rules that determines its own future state and the state of the patches around it.

In simplified form, the rules are:

ask live cells –
–age + 1
–check for surrounding food
–if not enough food, die
–else eat food (distributed over surroundings, appetite age-dependent)
–if old enough, spawn new cell into one available neighbor patch (spawn rate dependent on available food)
–subtract toxicity cost from surrounding food squares
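The rules above might be paraphrased in Python roughly as follows. This is a sketch under assumed parameter names (`spawn_age`, `appetite`, `toxicity`), omitting the age-dependence of appetite for brevity; the actual model is patch-based NetLogo.

```python
import random

def step_cell(cell, neighbors, spawn_age=5, appetite=1.0, toxicity=0.1):
    """One tick for a living cell. `neighbors` is a list of surrounding
    patches, each a dict with 'food' and 'occupied' fields."""
    cell["age"] += 1
    food_available = sum(n["food"] for n in neighbors)
    if food_available < appetite:
        cell["alive"] = False  # not enough surrounding food: die
        return
    # eat: draw the appetite evenly from the surrounding patches
    for n in neighbors:
        n["food"] -= appetite / len(neighbors)
    # if old enough, spawn into a free neighbor (rate dependent on available food)
    if cell["age"] >= spawn_age:
        free = [n for n in neighbors if not n["occupied"]]
        if free and random.random() < min(1.0, food_available / 10):
            random.choice(free)["occupied"] = True
    # toxicity cost degrades the surrounding food supply
    for n in neighbors:
        n["food"] = max(0.0, n["food"] - toxicity / len(neighbors))
```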

I found that Petri produced emergent bacteria growth behavior that resembles real bacterial growth. The starting bacteria population matures to reproductive age, grows exponentially, reaches an equilibrium of growth and death rates, and then gradually dies off. I found that the toxicity cost and variable reproduction rate mechanisms were instrumental in preventing unchecked growth. I also ran a BehaviorSpace exploration of the model parameters of toxicity, eating radius, and appetite, and found that the best-fit models relative to the real bacteria had low toxicity and similar appetite and eating radius values (both high or both low). I hypothesized that these had counterbalancing effects in preventing the bacteria from eating too much or too little food for stable growth.

If you are interested in learning more about Petri, feel free to email me!

Posted in Final projects

Week 9 Reading Response

The ideas discussed in both Maroulis, Guimerá, Petry, Stringer, Gomez, Amaral, and Wilensky and in Buchanan stress how ABM simulations can add to our understanding of social processes. They also (much more pointedly in Buchanan) give rise to the idea that ABMs are not widely accepted in the academic community, especially among econometricians. For instance, Buchanan indicates that the NASDAQ implements new strategies after testing them through ABMs, while the SEC refuses to do the same. But what causes the unwillingness to accept a new style of modeling?

Most obviously (as behaviorists have shown in many psychology experiments), people are in general resistant to change, even when it takes them in a profitable direction. The problem is that this skepticism encourages people to discount models without considering how they could be improved, whether through better calibration or better tests of the models' effects. If instead people looked at ABMs and the potential they bring for increased understanding of social systems, and concentrated on the best situations in which to use them and ways to learn from them, then they might move from the fringes into more mainstream use among academics.


Posted in Uncategorized

Autumn: Week 9 Reaction

In “Complex Systems View of Educational Policy Research,” I thought the discussion of locale and policy implementation was interesting. The example of California mandating state-wide class-size reductions based “partly on the basis of results of a randomized field experiment in Tennessee” encompassed micro- and macro-effects on a few levels. First, the Tennessee experiment was randomized, so some of the variability that occurs among student bodies was eliminated, something that was probably not true of the schools in California. Second, the state tried to replicate the results of that small sample with a population that was probably different, both because it was not in Tennessee and because it was not random. A better intermediate step between the experiment and the California regulation would have been to at least replicate the experiment multiple times in different samples throughout California.

In “Meltdown modeling,” I enjoyed the discussion of the adoption of ABM within the finance industry and the industry's preference for econometrics. The discussion of familiarity was interesting and seemed to mimic technology adoption curves. Industries tend to be slow to adopt innovations because of the higher costs associated with the risk of deviating from industry norms. But when I put myself in the position of decision-makers, my question was, “Why wouldn’t I at least try to run ABM alongside my standard mathematical models?” Running two types of models would allow industry leaders to evaluate the effectiveness of both side by side in the same situation. Instead, many seem incredibly ambivalent and inert. I think that is to the disadvantage of those companies; should ABM really take off, they will be caught flat-footed on the wrong side of the adoption curve.


Posted in Week 9 Reaction

Lorraine – Week 9 Reading Reaction

“Complex Systems View of Educational Policy Research” by Maroulis et al.: I am interested in precisely how ABM can address “questions pertaining to the paths between equilibrium points (of computational general equilibrium), such as whether a transition to choice might make a system worse before it gets better and for how long and for whom it is worse.” In reality, such an initiative would most likely be stopped in light of the initial failure. The value of ABM will be underscored if the status change is something that ABM, but not other methods, is able to predict.

“Meltdown Modeling” by Mark Buchanan: Using ABM to model meltdown is a great idea. Volatility is a direct result of the expectations of market participants, making an agent-based approach particularly suitable. I can envision synergy with behavioral economics. The biggest challenge is that the market has expanded to an extent at which it may be difficult to avoid overparameterization. For example, you cannot examine the stock market without considering the credit market. Is it possible for a model not to be merely retroactive (e.g., “had I known the U.S. subprime bubble could bring down the global financial system”)? There is also a huge variety of firms and instruments with different purposes, and obviously participants with different knowledge and views.

Posted in Uncategorized

Week 9 reading reaction: Margaret

Over the past few weeks, I have come to increasingly appreciate the potential use of ABMs in creating a forum for interdisciplinary research. Whilst it is often the case that the models produced by distinct disciplines would be more accurate (and therefore more useful) if their assumptions were informed by perspectives from a greater diversity of disciplines, researchers often lack a common language to share insights about social phenomena. ABM presents a potential means to overcome this: as has been emphasized throughout this course, a key benefit of this technique is that it is often relatively simple for an observer to understand the dynamics of how agents interact within a model. Therefore, as Buchanan suggests, it is not only the case that the nature of ABM might present a more appropriate technical approach for exploring policy-relevant phenomena, but also that ABM increases the ease with which dialogue may be conducted between researchers from vastly different academic backgrounds, thereby strengthening the basis on which models are designed.

Maroulis et al. raise a related issue: they suggest that researchers exploring micro-level issues often do not fully engage with the macro-level implications of their research. This is problematic from the perspective of policy: moving a system between different equilibrium states needs to be carefully thought through in order to anticipate unexpected consequences. Again, ABM provides a potential solution as a mechanism through which micro-level findings can be scaled up to understand broader social consequences. Rather than an alternative to current methods, ABM serves as a powerful complement to existing research.

Posted in Reading reactions, Week 9 Reaction

Frank: Week 9 Readings

Klugl et al. study multi-agent modeling in comparison with other modeling formats such as queueing networks and Petri nets, two common types of modeling. Queueing networks are great for performance analysis given initial conditions, while Petri nets are a system-modeling tool for examining various kinds of concurrent systems. The conclusion of the paper is potentially its most intriguing portion, as it analyzes the situations in which formal, concurrent, and agent-based modeling each have advantages and disadvantages. To the modeler, it should be apparent that simulating with one tool versus another has, by definition, different merits for looking at different parts of the model. Klugl makes the point that elements with continuous connections may be better developed using fixed input-output methods.

Maroulis et al. present an archetype of the mixed model- and mechanism-based approaches elaborated upon by Klugl et al. They call for designs that examine micro-level and macro-level data when studying the successes and failures of educational policies and their effects on students.

The Nature article presents an interesting case of how agent-based models might operate in the real world. Essentially, the scenario pictured has government agencies with dashboards of potentially entangled agents representing hedge funds, banks, and governments. The article acknowledges that this type of control would most likely never be possible, owing to the scale of the endeavor, but it brings to mind what could be possible with ABM: a localized version that does not dictate orders to hedge funds and governments, but perhaps helps local power companies examine electricity usage in different districts.

Posted in Uncategorized