Preparing for a Data Science Career in Agriculture with a Master’s Degree

Across the American Midwest, as they have done for decades, massive combine harvesters move across the Plains: fields of corn, wheat, and sorghum producing a gross annual output of nearly $375 billion. But what is unseen is arguably more valuable: invisible streams of wireless data rising through cellular networks into the cloud, feeding the powerful analytical systems that now decide when, where, and how to sow the fields and harvest the crops they produce.

For as long as the technology has been available, farm equipment has been fitted with:

  • Global Positioning System (GPS) trackers
  • Soil moisture and mineral content sensors
  • Seed spacing detectors
  • Harvest quantity measurement systems
  • Fertilizer spread analysis systems

Recent innovations in data science, such as the cellular/cloud collection system pioneered by Chicago’s 640 Labs, are increasing both the volume and availability of that information. After the harvesters have gone back to the barn for the night, embedded sensor networks continue to whisper data into the ether. Drones pass over the open fields, recording video. Further above, satellite networks beam down infrared imagery of croplands revealing more than meets the naked eye.

According to a 2015 article in Fortune, 151 tech startups focused on agriculture and food production were funded in 2014, to the tune of $976 million. Major agribusiness conglomerates like Monsanto, DuPont, and Archer Daniels Midland are all investing in agriculture data science programs.

As America’s multibillion-dollar Big Ag industry and the Big Data startups that have sprung up around it look for new ways to put this data to work, skilled data scientists educated at the graduate level are becoming the agricultural soothsayers of the 21st century.

Malthusian Nightmare Averted: Feeding the World One Byte at a Time

The agriculture industry is the unsung hero of the modern world—and, in fact, the only reason there is a modern world as we know it.

In an early, if inaccurate, exercise in elementary analysis of arable land, the Reverend Thomas Malthus calculated in 1798 that population growth would eclipse the agricultural production capacity of the world and result in catastrophe as the masses ran out of food.

Malthus, of course, failed to account for advances in farming science and technique, advances driven by more forward-looking scientists who have been able to increase crop yield and reduce blights and other agricultural disasters through a cooler assessment of data.

Data-Driven Precision Agriculture

Today, despite distribution problems that still result in crises of hunger in much of the third world, the world’s farmers actually produce more than enough food to keep up with the planet’s growing population.

They call it precision agriculture: increasing efficiency and productivity through scientific approaches to planting, fertilizing, and harvesting. According to Brand Niemann, a former data scientist with the Environmental Protection Agency, the next evolution in precision agriculture will be through data science.

The United States Department of Agriculture (USDA) has brought the vast resources of government to the party with its Open Data Initiative (OpenAg), which publishes more than 500 data sets for public consumption. OpenAg aims to integrate government data sets with data sets made public by private agribusiness, combining information to power research that will benefit everyone.

The goal, eventually, is to put the output of all that data onto a farmer’s smartphone, telling the farmer exactly how to manage the crops that day for maximum yield. Monsanto’s Chief Technology Officer, Robert Fraley, believes that by 2050 production will ramp up enough to meet growing needs with even less land than is being cultivated today.
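To make the smartphone scenario concrete, here is a minimal sketch of how sensor readings might be turned into a daily recommendation. Everything in it is an assumption made for illustration: the ZoneReading record, the zone names, the sample values, and the 0.20 moisture threshold are hypothetical and do not come from any real product or from OpenAg.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ZoneReading:
    """One hypothetical soil sensor reading from a named field zone."""
    zone: str
    soil_moisture: float  # volumetric water content, 0.0 to 1.0 (assumed unit)


def zones_needing_irrigation(readings, threshold=0.20):
    """Return the sorted names of zones whose average moisture is below threshold.

    The threshold is an illustrative assumption; a real system would tune it
    per crop, soil type, and growth stage.
    """
    by_zone = {}
    for reading in readings:
        by_zone.setdefault(reading.zone, []).append(reading.soil_moisture)
    return sorted(
        zone for zone, values in by_zone.items() if mean(values) < threshold
    )


# Made-up sample data: two readings per zone.
readings = [
    ZoneReading("north", 0.18), ZoneReading("north", 0.16),
    ZoneReading("south", 0.31), ZoneReading("south", 0.27),
    ZoneReading("east", 0.21), ZoneReading("east", 0.17),
]

print(zones_needing_irrigation(readings))  # ['east', 'north']
```

A production system would combine many more signals (weather forecasts, crop stage, yield history), but the basic shape is the same: aggregate raw readings by location, compare them against a rule or model, and return a short list of actions.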

From Ye Olde Farmer’s Almanac to Data-Driven Predictive Modeling in the Hour of Climate Chaos

Weather has always been the overriding factor for farmers trying to bring in a crop. Even the advent of irrigation and the development of mechanical systems to disrupt frost and freezing patterns have not wholly removed farmers from operating at the mercy of nature. It seems the implacable forces of atmospheric weather systems and the droughts and deluges they bring can shred any obstacles we puny humans try to throw in their path.

Farmers of every era practiced rudimentary forms of long-term data analysis on climate trends, and they tended to settle and cultivate in regions where the numbers played out in their favor.

With the climatic shifts being wrought by anthropogenic climate change, though, the old patterns will not hold. The advantage that data scientists can offer over traditional predictive models (à la Ye Olde Farmer’s Almanac) is that their data sets can extend far beyond human memory, and even human existence, and tap into records inscribed in the stones across the geologic time scale. Data from previous eons, when the Earth underwent similar changes, can be factored into models and used to provide more accurate predictions for the agriculture industry to use for planning.
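As a toy illustration of what "factoring historical data into a model" means at its simplest, the sketch below fits a straight-line trend to a short record and extrapolates it. The temperature figures are invented for the example, and the function names are hypothetical; real climate models are vastly more elaborate than a least-squares line.

```python
def linear_trend(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate the fitted line to a new x value."""
    return slope * x + intercept


# Invented data: years since the start of the record, and average
# growing-season temperature in degrees Celsius for each year.
years = [0, 1, 2, 3, 4, 5]
temps = [18.1, 18.3, 18.2, 18.6, 18.7, 18.9]

slope, intercept = linear_trend(years, temps)
print(f"trend: {slope:.3f} degrees per year")
print(f"projected temperature in year 10: {predict(slope, intercept, 10):.2f}")
```

The longer and more varied the record, the less a projection has to rely on a handful of recent seasons, which is the advantage the geologic record offers over human memory.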

Genomic Research in Big Ag: Not Just Roundup Ready, but Ready for Anything

Where and how to plant crops aren’t the only decisions that data science seeks to influence. Increasingly, the crops themselves are being altered by data analysis before they even go in the ground.

It won’t escape the notice of anyone who took high school biology that the father of modern genetics, Gregor Mendel, made his breakthroughs in that field in the process of studying plant hybridization. Even so, it would take decades for the significance of Mendel’s pioneering genetic research to be fully recognized. Prior to that, his studies were thought to be about nothing more than the hybridization of pea plants.

Although the methods used today would be unrecognizable to Mendel, the basis of agribusiness’ genetically modified organism (GMO) efforts would be understood instantly as a continuation of ancient efforts to breed plant and animal strains better suited to the environment in which they are grown and to the uses for which they are raised.

Sequencing crop genomes has become almost trivial. A complete genome can be processed for around $1,000, and the process will soon be automated to the point where it takes only hours.

Analyzing all that data is another story. Years and years of research can go into studying the genome of a single strain of crop; years and years more might be spent tweaking the base pairs to develop new traits valuable to farmers.

Genetic research doesn’t just go into manipulating genes for the purpose of pest and weed control, which so far has been the chief driver behind GMOs, including Monsanto’s much-maligned suite of Roundup Ready corn and soybean seeds.

The information emerging from genome sequences tells farmers how their seeds will respond to things like disease and drought, informing their decisions in real time on how and when to intervene, a concept that would’ve seemed magical even to the water diviners of the 20th century.
