Rob Carlson, PhD, is a Managing Director of Planetary Technologies (née Bioeconomy Capital) and an Affiliate Professor in the Allen School of Computer Science & Engineering at the University of Washington. At the broadest level, Carlson is interested in the future role of biology as a human technology. He has worked to develop new biological technologies in both academic and commercial environments, focusing on molecular measurement and microfluidic systems. Carlson has also developed a number of new technical and economic metrics for measuring the progress of biological technologies. He is the author of the book Biology is Technology: The Promise, Peril, and New Business of Engineering Life, published in 2010 by Harvard University Press; it received the PROSE award for the Best Engineering and Technology Book of 2010 and was named to the Best Books of 2010 lists by writers at both The Economist and Foreign Policy. He is a frequent international speaker and has served as an advisor to such diverse organizations as the United Nations, the OECD, the US Government, and companies ranging in size from startups to members of the Fortune 100. Carlson earned a doctorate in Physics from Princeton University in 1997. Carlson, who was a speaker at the PUZZLE X conference in Barcelona Nov. 6-9, recently spoke to The Innovator about why storing data in DNA could be the Next Big Thing.
Q: Why do we need to store data in DNA?
RC: To solve a looming problem. The world is producing data all the time, and the volume is growing not just exponentially but hyper-exponentially. The technologies we now use to store data won’t be able to keep up with demand for much longer. In ten years, we will probably be facing the end of life of tape, or may even be past its limits. We know we can store an enormous amount of data on tape, but demand is accelerating so fast that the number of tape factories and data centers we would have to build would be economically implausible and would have an unacceptable impact on the environment. It has not dawned on people yet that we are going to have a storage crisis. Most companies are not thinking in ten-year time frames. I expect what will happen is that the price of tape will increase as pressure on supply builds, at which point we will look around a little more actively. So, what are the options? We already have a lot of experience with one storage medium that is stable at room temperature and that lasts for millions of years, and that is DNA. We can read DNA, and we can write data into synthetic DNA, which has the potential to store at least eight orders of magnitude more data than tape, significantly reducing the amount of space and material needed for future archival storage. But we need to be able to write DNA much faster than we can today. The problem is that we don’t know how long it is going to take to get there, and there is likely going to be a diversity of approaches. If there is one thing biology teaches you, it is that there is almost always more than one solution. I hope they end up being interoperable. Maybe we need to agree ahead of time that governments will fund work on only one standard. We have done that before with chips.
Q: You are credited with the Carlson Curve, the prediction that the doubling time of DNA sequencing technology (measured by cost and performance) would be at least as fast as Moore’s Law. How do the two differ?
RC: DNA and transistors currently play very different roles in the market. The computer in your pocket fits there because it contains orders of magnitude more transistors than a desktop machine did fifteen years ago. Next year, you will want even more transistors in your pocket, or on your wrist, which will give you access to even greater computational power in the cloud. Those transistors are manufactured in facilities now costing tens of billions of dollars each, a trend driven by our evidently insatiable demand for more and more computational power and bandwidth access embedded in every product that we buy. The total market value for transistors has grown for decades precisely because the total number of transistors shipped has climbed even faster than the cost per transistor has fallen.
In contrast, biological manufacturing requires only one copy of the correct DNA sequence to produce billions in value. That DNA may code for just one protein used as a pharmaceutical, or it may code for an entire enzymatic pathway that can produce any molecule now derived from a barrel of petroleum. Prototyping that pathway will require many experiments, and therefore many different versions of genes and genetic pathways. Yet once the final sequence is identified and embedded within a production organism, that sequence will be copied as the organism grows and reproduces, terminating the need for synthetic DNA in manufacturing any given product. This is why the industrial scaling of gene synthesis is completely different than that of semiconductors.
What is needed to make DNA data storage work is to mash these two technologies together.
Q: What is the state of play?
RC: Microsoft had a project, which I was part of, aimed at building high-throughput DNA synthesis chips with features small enough to enable a commercial DNA data storage product, but the company shut it down. I expect that, ultimately, we need to develop a system that uses standard semiconductor logic to control enzymes that synthesize DNA. Doing the R&D and building a product looks to me like it will cost about $100 million, but if we make that investment over a couple of years, we could make real progress.
Q: Why should business care?
RC: We need to hop on this before the storage crisis hits and we are forced to start forgetting some data. Boeing and GE currently collect vast amounts of data every hour about every flight. This information is valuable for forensic purposes but very difficult to keep due to the volume. That sort of problem is going to hit society at large at some point, especially with the introduction of generative AI, which is going to ensure that we run out of storage capacity that much sooner. I am not seeing a lot of options for adequate data storage capacity other than DNA. Biology is more sophisticated at information processing than anything humans have invented. We are going to learn a lot from biology over the next decades, and what I hope is that we can harness it to tackle some of the world’s biggest problems, including the data storage crisis.