joelb123.github.io

Current Opinions of Joel Berendzen


Current Opinions of Joel Berendzen (mid 2022)


“Follow the data!” was the dictum my advisor, Hans Frauenfelder, gave me more than 30 years ago as I was thinking about what to do post-PhD. When Hans said that, biology wasn’t especially data-rich, but the situation has changed. For example, the Webb Space Telescope is expected to produce around 200 TB of data per year. A premier biological sequence observatory, the Broad Institute, has been producing that much data per month for the last few years.

Moreover, the number of labs with sequencers in them is a lot larger, and growing faster, than the number of labs with telescopes. It’s not just sequencers, either; there are sizeable data flows from protein crystallography at synchrotron beamlines worldwide, and there are about to be huge streams coming out of microscopy-driven projects such as the Human Brain Mapping Initiative. Biology has quietly become the most data-intensive science of the Age of Big Data.

Here’s an essay on creating the Theory of Biology by building bridges among data, signatures, models, and applications.

Much of my recent work has been on genomic sequences. Here are some thoughts on bioinformatics and bioinformaticians.

I create software to analyze data, and I try to write for the future as well as to solve particular problems today. If I do my part well, my efforts serve as a model and example, not just a means to an end. Here are my thoughts on writing scalable software.

I have published roughly 50 papers with over 13,000 citations that explore the interrelations among sequences, structures, gene-family trees, dynamics, and hydration.

Here is an overview of some of my code repositories and other places where I’ve contributed.

You can comment on this page or reach me on Twitter.