The Official List of Dead Cliches in Reaching Audiences of real People (O.L.D.C.R.A.P.) about Genomics

I like doing outreach. I like talking to non-scientists about science, but I’m a relative newbie in science, so the powers-that-be rarely let me give bigger talks to bigger groups. So I frequently find myself watching more senior scientists give talks to non-scientist audiences.

And I am sick of seeing the same trite, tired, ineffective, cliched slides and hearing the same words from each of them.

Henceforth, I propose that we, as concerned scientists, kill and bury the following cliches. This list will never be complete, and I welcome proposals for additions:

1) The Moore’s law slide.

Yes, the cost of sequencing DNA has gone down. It’s gone down fast. It’s gone down faster than it “should.” Why in the world do we care? This ain’t no economics lecture. It used to matter because of the mythical $1000 genome and how close we were to that. We hit it; there were no parades. How about we instead show the rate of production of DNA sequence in hard drives? Or terabytes? Or petabytes? As an aside, the government is good at some things, and quality slide aesthetics seems to not be one of them.

2) “Junk DNA.”
Call it the Jason Voorhees of genomics metaphors for the number of times it’s been killed and revived only to be killed again. Whoever decided that, just because we didn’t know what this DNA was doing, it meant it was junk ought to feel shame. So very much shame. Your mother gave you this DNA, and it was important to her that you have it, so it should be important to you. People who call it “junk DNA” don’t love their mothers.

3) “Dark Matter of the Genome.”

Photo Credit to @ErinPodolak


This is tied up in the recently departed junk DNA accusation. As “junk DNA” gets zombified, a new term has cropped up to admit that, okay, it’s not junk, but we don’t know what it does. Biologists have borrowed a highly defined term from cosmology to describe pieces of the genome with unknown function but known presence. It’s at least better than “junk DNA,” so it’s more like hating your great aunt than hating your mother.

4) The GWAS SNP slide.
[Image: GWAS_2012-12]
There. Is. No. Information. In. This. Presentation. It used to be cute when you could see that you’re pointing out stuff on chromosomes. It used to be cute when there were only a couple colors and a couple regions you cared about. It used to be cute before Calibri. Now, it looks like someone spilled sprinkles on their mom’s team-building memos. What’s the point? There are exact counts of SNPs. There are bar charts that show how many pieces of DNA are correlated with a disease. If you need to rely on a confusing slide to express that genomics is confusing, you miiiight be a scientist.

Okay, end rant. Who’s got more OLDCRAP for me to hate on?

Posted in Musings | Leave a comment

Asari Sex & Genes 101: How We Do It Like Liara

Copyright ideonexus under Creative Commons Attribution License

The Asari are near the top of the list of uber-powerful galactic babes, alongside Leia and Twi’leks. Yet, if you’ve played Mass Effect, you know that they’ve got a peculiar way of ensuring the stable beauty of their species. Asari can and do mate with many other species, but the offspring is always wholly Asari and pretty much always gorgeous.

In human-human mating, half of your genes come from your mom and half from your dad. In humans and many other species, this halving is accomplished by chromosomes– big chunks of DNA that break the encyclopedia set of all your genes into 23 pairs of books. Half of these chromosomes come from the sperm cell and half come from the egg cell. The cell that results when the sperm meets the egg then has two copies of each chromosome.

Asari, supposing they also use DNA for their parent-to-offspring inheritance, pass on their genes in a slightly different way. When an Asari and a member of another species do it, apparently all of the DNA containing all the genes comes from the Asari parent and none from the other one.

Wait, doesn’t this make each Asari an exact genetic duplicate of its mother? Wouldn’t that mean there are gobs of Asari clones walking around? Since there is some variation between Asari individuals, there must be something different than this.

It turns out that one of the two copies of the genes gets “shuffled.” So each Asari daughter gets one unshuffled set of chromosomes from her mom; the other set also uses her mom’s DNA as source material, but it gets shuffled. The shuffling happens based on the thought patterns of the mate.

So this gene “shuffling,” does that happen in human mating? We learn in school that you get one of each chromosome from mom and one from dad, but do you end up with exactly the same chromosomes your parents have? Nope!
[Image: Meiosis overview]

When sperm cells and egg cells form– a process called meiosis– there is something like Asari shuffling. In humans, we don’t shuffle one copy of each chromosome’s genes by thought patterns, but instead there is a shuffling-like step called “crossing over.”

During meiosis, there is a point where either your mom’s cell or your dad’s cell divvies up chromosomes between two egg or sperm cells. (Each egg or sperm is supposed to have only one copy of the chromosomes, so we *have to* split the chromosomes.) It turns out you can take chunks of a given chromosome along with chunks of the one it’s paired with and make a combined chromosome. You’re not getting exactly the same chromosome that your grandparents had or your parents had. It’s not exactly like Asari shuffling, but it keeps our diversity spicy.
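If it helps to see the chunk-swapping spelled out, here is a toy sketch in R. It assumes a single crossover point and a chromosome with six made-up genes; real chromosomes can cross over in more than one place, and the function and gene names are hypothetical.

```r
# Toy model of crossing over: each chromosome is a vector of gene labels
# tagged with the grandparent it came from. All names are hypothetical.
cross_over <- function(chrom_a, chrom_b, breakpoint) {
  stopifnot(length(chrom_a) == length(chrom_b),
            breakpoint >= 1, breakpoint < length(chrom_a))
  # Everything after the breakpoint gets swapped between the pair
  tail_idx <- (breakpoint + 1):length(chrom_a)
  recombined_a <- chrom_a
  recombined_b <- chrom_b
  recombined_a[tail_idx] <- chrom_b[tail_idx]
  recombined_b[tail_idx] <- chrom_a[tail_idx]
  list(a = recombined_a, b = recombined_b)
}

from_grandma <- paste0("gene", 1:6, "_grandma")
from_grandpa <- paste0("gene", 1:6, "_grandpa")

# Swap everything after the fourth gene
result <- cross_over(from_grandma, from_grandpa, breakpoint = 4)
print(result$a)
```

The first recombined chromosome carries genes 1 through 4 from grandma and genes 5 and 6 from grandpa, which is why the chromosome you pass on is not an exact copy of either one you received.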

When the chromosome pairs don’t separate correctly in sex cells, we get disorders like Down syndrome and Klinefelter syndrome. These are caused by the fertilized egg having the wrong number of chromosomes and giving all other cell types too many copies of certain chromosomes. This makes me wonder: do Asari always get the right number of chromosomes? Does that mean that there is no possibility of these types of disorders happening among Asari? Does anyone Citadel-centric know?

Reference: http://masseffect.wikia.com/wiki/Asari

Posted in Science 101 | Leave a comment

Things I Learned at #scio14 – Building the Network

This above all, know thy audience. It is amazing how frequently these words were uttered, have to be uttered, and should be uttered to all who communicate. Beyond this, the most important thing I got out of Scio14 was the community, the network.

I went to my first ScienceOnline event in the form of the dearly departed scioBeantown. This group, alas, was destined for failure because hashtags-plus-nicknames-longer-than-city-name doesn’t work. Through this group, I realized the potential for awesome in the ScioX meeting and vowed to go. I refreshed the registration page gobs as soon as it was scheduled to open to guarantee a spot for myself. (As an aside, we [read: Erin Podolak and Haley Bridger] rebranded as scioBoston and will start kicking tail as soon as the tundra thaws.)

The chronological first thing I learned at Scio14 was that no one’s a vegetarian because there are irremovable numbers of tardigrades in pretty much all the plants. Thanks, Meg Lowman.

I learned that we need to form a roving band of friendly scientists to put their faces in front of people who treat scientists like endangered species. I had a great talk with Emily Finke to this effect and really want to rope in AAAS/@MeetAScientist. Let’s Howard Dean this stuff.

Dave Wescott‘s self PR (#scioSelfPR) talk might have been my favorite. I am a great follower of instructions, and his were salient and succinct: know your audience, get where they are, tell them what they want to hear.

I learned, via Justin Kiggins, that there is a man named Todd whose job it is to control bird populations at LAX.

I learned that, dammit, I need to make swag with my Twitter handle on it.

I learned that there are two rules governing tweeting something and understanding it:
1) If you can’t tweet it, you don’t understand it. ~Janet Stemwedel
2) Even if you can tweet it, it doesn’t mean you understand it. ~Pascale Lane

I learned that PowerPoint karaoke will DEFINITELY be atop my suggested events for the institute retreat.

I learned that there are many hats and varying numbers of people to wear them in the world of “helping useful people help others.” Mentors, sponsors, advocates, cheerleaders all describe some subset of these people, but we lack a Venn diagram describing their overlap. The conversation should continue on #scioMentor.

I learned that there is power in this community. I covered my feelings of distraction previously, but it was so powerful to see important problems addressed head-on by those most affected.

I learned that, if I must depart this world sometime, I would prefer to be drowned in coconut cream.

The Network:
I wanted to capture the connections made at Scio14, and I am a computer geek. I decided to parse Twitter hashtags for who used them and create a network based on the ScienceOnline logo.

1) To parse Twitter, I installed the R package twitteR.
2) I found the most recent 500 tweets using each of the session hashtags.
3) I used igraph to plot the edge list where each edge/line connects a tweeter to a hashtag.

The output image is a PDF and can be searched by your username. There’s a certain amount of randomness in the layout, so apologies if you’re behind someone– nothing nefarious there. I myself am behind the #scioHope node.

The Code:
library(twitteR)
library(ROAuth) # provides OAuthFactory
library(igraph)

### Convince Twitter you're legit
reqURL <- "https://api.twitter.com/oauth/request_token"
accessURL <- "https://api.twitter.com/oauth/access_token"
authURL <- "https://api.twitter.com/oauth/authorize"
consumerKey <- "c0nsum3rk3y" ### Replace with your own OAuth data
consumerSecret <- "c0nsum3r5ecr37" ### Replace with your own OAuth data
twitCred <- OAuthFactory$new(consumerKey=consumerKey,
                             consumerSecret=consumerSecret,
                             requestURL=reqURL,
                             accessURL=accessURL,
                             authURL=authURL)
twitCred$handshake()
registerTwitterOAuth(twitCred)

### Hashtags to retrieve (unique() drops the repeated "scioMentor")
toGet <- unique(c("scioSafe", "scioMentor", "scioWomen", "scioWiseup", "scioBoundaries", "scioLaw", "scioCritSci", "sciSciLit", "scioSciBiz", "scioLang", "scioDesign", "scioAlt", "scioDiversity", "scioTools", "scioSciAll", "scioSociety", "scioCollab", "scioProcess", "scioSelfPR", "scioReview", "scioTradLit", "scioHope", "scioPsych", "scioPress", "scioBigSci", "scioBeyond", "scioAble", "scioNews", "scioBeltway", "scioSciComm", "scioMOOC", "scioSuccess", "scioLive", "scioResearch", "scioVirtual", "scioParasite", "scioImprove", "scioVidBrand", "scioVisual", "scioDigLit", "scioJSE", "scioBlogNet", "scioPlatforms", "scioComments", "scioMentor", "scioBootstrap", "scioSchoolTools", "scioUncertainty", "scioCommunity", "scioStandards", "scioImagine", "scioEthics"))

edges <- matrix(nrow=0, ncol=2)

### Retrieve 500 tweets per hashtag
for (i in seq_along(toGet)) {
  tweets <- searchTwitter(paste0("#", toGet[i]), n=500)
  if (length(tweets) == 0) next # nothing found for this hashtag
  df <- do.call("rbind", lapply(tweets, as.data.frame))

  ### One edge per unique tweeter, pointing at the hashtag
  users <- unique(as.character(df$screenName))
  newEdges <- matrix(ncol=2, nrow=length(users))
  newEdges[,1] <- users
  newEdges[,2] <- toGet[i]

  edges <- rbind(edges, newEdges)
  Sys.sleep(30) # because Twitter doesn't like it without
}

### Do some prettying up
theGraph <- graph.edgelist(edges)
V(theGraph)$color <- "orange"
V(theGraph)$frame.color <- "orange"
V(theGraph)$size <- 2
V(theGraph)$label.color <- "black"

### Highly connected nodes (mostly the hashtags) get the inverse colors
hubs <- degree(theGraph) > 40
V(theGraph)[hubs]$color <- "black"
V(theGraph)[hubs]$frame.color <- "black"
V(theGraph)[hubs]$size <- 5
V(theGraph)[hubs]$label.color <- "orange"

plot(theGraph, vertex.label.cex=.05, edge.width=0.001, edge.arrow.mode=0, layout=layout.kamada.kawai)

Posted in Uncategorized | Leave a comment

My Offline Contribution to #scioSelfPR

Hi, I’m TSS, my website is TheSnarkyScientist.com, and I am a previous karaoke contest winner.

My audience is grown-ups who vote and who had their last science class years ago. People get busy, so I can forgive them for not knowing what a gene is.

My question is a bit self-deprecating: I’m not as good a writer as a lot of the people in this room right now. I have found my strength is more in face-to-face communication. How can I find ways to get my target audience in front of me or in an AV situation where I get to talk to them and it isn’t a classroom?

Posted in Uncategorized | 2 Comments

On The Lingering Issue of Harassment at #scio14

I am a relative newbie to the ScienceOnline community. I have partaken in ScioBeantown in its previous iteration and hope to continue as part of ScioBoston. Through this group, I have met some wonderful people that are hurting right now. My status as a newbie affords me less of the in-community baggage that’s accumulated. That I am unabashed affords me the opportunity to be blunt.

The first breakout session of Scio14 had selections focused on women in science and ways to be an ally to all “minorities.” According to the stream, attendance at these sessions was highly slanted toward women. Apparently the males (myself included, hypocrisy noted) chose poorly and went to a less-weighty law session. Following the tweets from the other session (#sciowomen and #scioboundaries) made me feel horrible for not attending the other sessions, and not being an active contributor to addressing and solving this problem.

Why were there so many sessions focusing on women in science and harassment? Well, Bora Zivkovic, a godfather of science communication, was an extensive, repeat [alleged (added 3/2)] harasser of women. This has been covered elsewhere, and, Dear Newbies, a Google search will give you the lay of the land. This led to a conceptualization of “ripples of doubt” wherein female communicators couldn’t separate his favoritism and assistance from their own ability to succeed without his help. Tl;dr, A guy named Bora was powerful and [allegedly (added 3/2)] harassed a bunch of female science communicators.

This elephant in the room needed to be addressed here, because Bora was a big part of ScienceOnline. This led to the session selection. But this wasn’t enough. Friends I’ve made early in this community went to the sessions that tap danced around the subject of Bora [allegedly (added 3/2)] harassing members of the community. Friends were left shaken by the fact that these sessions covered nothing substantially and didn’t really address Bora by name.

This leaves me with a question: why bother having the sessions on harassment and inclusion if 1) the right people didn’t attend and 2) you aren’t going to address the actual issue?

This matter is hanging like a fart in the air over the conference and needs to be addressed head-on. It should have been the first thing mentioned in a plenary session. The hard truth should not be opted into. It is the only way to build community with newbies like myself. They need to know how we got to where we are today. We all need to know what to look for. And, above all others, we all need to know how to be allies to each other.

Posted in Uncategorized | 10 Comments

How and Why Journals Should Respond To Retractions

[Image: Litmus paper]

This picture is of a litmus test. It’s a paper strip with various dyes on it that can be used to detect the pH (acidity or alkalinity) of a liquid.

According to two papers that were published in late January, if you take cells that have committed to being a certain cell type and put them in a certain pH environment for a certain time, these cells behave as though they haven’t made that decision. That is, a cell that has committed to being a blood cell can be coaxed back to being a stem cell.

This did not pass my litmus test for how I understand biology to work.

Imagine my complete lack of surprise when irregularities were reported in the papers.

When the papers originally came out, I was in the middle of taking an NIH-mandated course on research ethics. A lot of the subject matter centered on publishing good research and being a good academic citizen. A lot of the punishment for not doing these things was “oh, you’ll have to retract your papers.” (As a side note, I don’t know if these papers will end up being retracted, nor is it what I’m arguing for at this point.)

In academic circles, having your paper retracted is an embarrassing issue. You only do this if you’re proven wrong in your findings or methods. Since your paper was published (root word “public”), this has to be done in the public eye.

Science is quality-controlled by peer review. This is the process by which fellow scientists read your paper and double-check that you didn’t get anything wrong or misrepresent your results. There’s another party involved in peer review that scientists tend to forget about: the journal itself. In negotiations to get a research article published, journals are represented by editors who solicit peer reviews from other masters of the field, since they themselves cannot be masters of all paper subjects. The editor then reads the reviews and gives a final decision about whether the paper gets to be published in the journal.

What happens to the journal when someone’s paper gets retracted? At worst, nothing; at best, double the hits on their website for those checking the article and then the retraction reason. Nature isn’t going to take it on the chin for, if the allegations hold, publishing bunk science. No, they’re going to publicize the investigation and make a big deal about taking down this researcher’s career– not that I’m defending publishing bunk science.

When a journal publishes a scientific paper, they have chosen to publicize a certain finding over all the other findings they could have. There is no one holding journals to a standard that this science be right. Journals are in the business of selling ads and magazines, not in the business of publishing good science. Oftentimes, the best science IS the most engaging and interesting article. Oftentimes, it is not. The editor is responsible for weighing the quality of the science (as analyzed by peer review) with the sexiness factor that will move the journal’s product.

Journals have to worry about their bottom line, and this bottom line is not always aligned with good science. So they paradoxically can benefit from publishing total crap science. Thus, I propose that, if a paper is retracted from a given journal, the underlying peer review process needs to be made public. This disclosure will help the readers of the original article understand how bunk science made it out in the first place. Did the peer reviewers voice these concerns and they were overruled? Did the editor do everything he or she could to address problems with the manuscript?

Why does this matter? Journals like Nature are taste-makers. As soon as these papers were published, I know of at least two other labs a stone’s throw from my own who tried this acid experiment. Think of the time and resources collectively wasted if this result was based on bad science. Keeping up with something interesting enough to be published in Nature is kind of an academic necessity, so of course we’re going to try replicating the results.

My favorite thing about the irregularities in the stem-cell-acid-bath papers coming to light is the timing: it took a mere 2.5 weeks for the papers to be published, get a ton of press, and then be picked apart by post-publication reviewers on the internet. If it only took such a short time for mere interested parties to find the holes, shouldn’t dedicated peer reviewers have picked them out too? Something has clearly gone pear-shaped here.

I really want to know who fails to keep bad science behind the gates. Journals have no legal imperative to do so if it’s going to sell, even if it wastes taxpayer dollars in the form of research funding. Peer reviewers’ names are usually kept blinded. And the writers of the article are of course going to do everything they can to get into Nature– it’s a big deal to get your paper published there.

TL;DR: because journals of a certain tier have no incentive to keep bad science unpublished, they should make the peer review process for retracted papers public.

And what’s the carrot for the journals? More transparency, invested peer reviewers, and, of course, more pageviews.

Posted in Musings | Leave a comment

These biology terms are badly designed. Let’s do better.

I lamented to @themodernscientist that there were so many bad terms in science. We kind of suck at making intuitive names for stuff, so knowing what they mean ends up being rote. Here’s some that I see frequently and that really stink along with what I think we should do about it.

Kinase. This one is entrenched, and I think that has a lot to do with the drugs targeting them. Methylases methylate; acetylases acetylate; kinases… kinate? No, kinases add a phosphate group. Wouldn’t you think there was a name something like “phosphorylases?” By Jove, there is! But both phosphorylases and kinases are subsets of phosphotransferases (another good option) that have specific *ways* of adding the phosphate group. Yet we focus on this mechanism more than the output. “Phosphotransferase” better describes this, the important part.

Gene. We scientists abuse this one by treating it as a shortcut. Everything in your DNA is a gene– enhancers, promoters, insulators, transposons, everything. Instead, we use “gene” far too frequently to refer only to the roughly 1 to 2% of the genome that encodes protein in its DNA. The only thing required to be in the Gene Clubhouse is that it’s inherited. How about “protein code” or steal other computational language like “protein script?”

Amino acid. That they have an amine group and they are acidic are the two least interesting things about protein monomers. They’re the Lego bricks of life! They’re links in chains that make machines out of biology! So much more brilliant than something about their silly chemical structure.

Linkage disequilibrium. No one outside of the genetics field has the slightest clue what this means. It’s such an intuitive, important concept that it should be better termed. Your genes are inherited not individually but in blocks of varying sizes. Linkage disequilibrium, or LD, refers to two spots in DNA that tend to be inherited together in the same block. So, if you get genes A, B, and C from your dad in one chunk of DNA, these genes are “in LD with each other.” How about something like “co-inherited?”
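For the curious, here is a rough sketch in R of how geneticists put a number on “inherited together.” It uses the standard textbook measures D and r², computed from counts of the four possible combinations at two spots in the DNA. The function name and the counts are made up for illustration.

```r
# Linkage disequilibrium between two spots, each with two versions
# (A/a at the first spot, B/b at the second). Inputs are counts of
# chromosomes carrying each combination; the numbers are hypothetical.
ld_stats <- function(n_AB, n_Ab, n_aB, n_ab) {
  total <- n_AB + n_Ab + n_aB + n_ab
  p_AB <- n_AB / total           # how often A and B travel together
  p_A <- (n_AB + n_Ab) / total   # how often A shows up at all
  p_B <- (n_AB + n_aB) / total   # how often B shows up at all
  D <- p_AB - p_A * p_B          # zero if A and B are inherited independently
  r2 <- D^2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
  c(D = D, r2 = r2)
}

linked <- ld_stats(n_AB = 40, n_Ab = 10, n_aB = 10, n_ab = 40)
unlinked <- ld_stats(n_AB = 25, n_Ab = 25, n_aB = 25, n_ab = 25)
print(linked)
print(unlinked)
```

In the first made-up population, A and B show up together more often than chance predicts, giving D = 0.15 and r² = 0.36. In the second, every combination is equally common, so both measures are zero and the two spots are not in LD.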

Enhancers and promoters. Enhancers don’t really enhance transcription and promoters don’t really promote transcription. Both refer to regions of DNA that help control transcription of protein-coding genes. Both of these types of regions are bound by proteins that act as signals for when to or not to transcribe. The promoter is generally the first chunk of the protein-coding gene that is transcribed, as well as a little bit upstream of it. How about “initiation site” or “staging site” for this region? If I believed what I was taught in school, I’d think enhancers are only for some genes that need extra-bonus-high transcription levels, so enhancers enhance the expression of these protein-coding genes. Instead, we’re now seeing that enhancers dominate the decisions about when to transcribe genes; they don’t just enhance transcription but cause it in the first place. The field has adopted “transcriptional regulators” for both regions that seem to amplify or quash transcription, and I think “positive” and “negative” would be good additions for this replacement term.

Recombinant DNA. This one just sounds overly scary, and also falls into the how-the-sausage-is-made camp of unnecessary specificity. It’s hard to “print” a DNA sequence artificially. So, if we want to make, say, a string of genes and put them inside a bacterial plasmid, we take various parts and stitch them together. This is the “recombination” part. I can see how the inventors didn’t want to say “artificial genes,” but can we do better with this one?

Honorable mentions:

  • Cancer as a singular noun. “Cancers.”
  • Anything ending in “-ome.”
  • Cell, which has a cute etymology story, but too many other meanings.

Now, I acknowledge that all of these names have come to be for someone’s idea of a good reason. In some circles it does matter that the kinase uses this process. It just doesn’t matter all the time and almost certainly not enough to be the prevailing name. Rote memorization of outdated, imprecise, confusing names is no one’s friend.

Science prides itself on replacing outdated models and ideas as better ones come along. Can’t we do so with the words we use?

Tell me which bad science terms I missed in the comments.

Posted in Musings | 2 Comments

Things I Hope to Get From #Scio14

I am jazzed for ScienceOnline. I camped out at my computer before registration opened and must have refreshed the page 50 times. I’ve heard how powerful an experience it can be, and I’m geeking out for my first time attending.

Following @PHLane‘s advice, I’ve started planning. So what do I hope to leave Raleigh having gotten?

1) A better network. The last time I was in an area with this high a density of SciCommers was the 2013 AAAS meeting. I quadrupled my follower count and learned almost all the basics of the realm. I also left having met some of the big players and the bests-at-what-they-do. scioBEANTOWN is going to be there in force, but I hope to meet folks from more fields and more coasts and those better at talking about my favorite things.

2) A/V design assistance. As much as I love writing, I know there are better writers than me and that it’s not always the most effective way to get my point across. I’ve been toying with making videos in addition to posting, but I’ve discovered I’m not a master of that space. I’ve signed up for the video pre-production workshop, and this is one of my skill goals for the conference.

3) BBQ. I love me some barbecue. I’ve never been to either Carolina, and I’m told they’re all right at it.

4) Efficiency clues. It must shock and pain you to see this, dear reader, but I’m often too busy to dedicate much time to TSS. I have to learn to be better with what little time I have. I’m in a state where I actually have ideas queued but little time to flesh them out. Everyone going is a busy professional in something, but either because of this or in spite of it, they manage to carve out their outreach time. When they have that time, they’re sagacious about spending it.

5) Inspiration. Some of my communication heroes are going. It will be some seriously rarefied air. I know I shouldn’t be going as a fan, so I’ll suppress my urge to squee, but I will be around people at the pinnacle of communication success. Seeing them do what they do so well will almost assuredly light a fire under my tail and push me to improve. If I keep my ears open, I’ll likely figure out how they do it too.

Posted in Musings | Leave a comment

Why our count of protein-coding genes should not necessarily shrink

Sometimes the most obvious, first-pass questions are the hardest ones to answer:
How big is the universe?
What kinds of cells are in the body?
How many genes do humans have?

Yeah, despite the work we’ve done with them, we still don’t have an exact count for how many genes humans have that have the instructions for making proteins. There’s a great and pretty readable article from Mihaela Pertea and Steve Salzberg here that covers the history of gene counting and emphasizes two big ideas in the first two figures:

1) How many genes something has doesn’t mean anything about how complicated or evolved something is. Grapes are not intrinsically more complicated than chickens.
Gene counts in a variety of species

2) We’re still homing in on how many genes humans have.
The trend of human gene number counts together with human genome-related milestones

During some low-key web browsing, I found out that the number is supposedly shrinking even further. “The shrinking human protein coding complement: are there fewer than 20,000 genes?” is a paper under review at Molecular Biology and Evolution and is written by Ezkurdia et al. I myself found it through an article called Human Genome Shrinks To Only 19,000 genes on Digg.

I have a couple gripes with this paper’s conclusions and one minor gripe with the fact that I’m seeing this paper now.

First, the science:
Ezkurdia et al. use a combination of output from previous papers and their own work using mass spectrometry (MS). They use MS to identify all the proteins in a population of cells in a high-throughput manner. Their idea was to make sure all the genes we think make proteins actually produced proteins– pretty simple concept, but really hard to prove.

Many groups have tried to find where in the genome transcribed mRNAs or translated proteins come from. Ensembl and ENCODE are two such groups, and they tend to take databases of known mRNAs or proteins and computationally scan all the genome sequence for likely sources. ENCODE produced a database called GENCODE using several variations of this method and found 20,345 coding genes (as of writing). The authors in Ezkurdia et al. used an earlier version of this (v12) that had 20,110 protein-coding genes and were able to identify proteins that corresponded to 11,838 (or 11,840 depending on which line you read) of them. This recovered count is in line with other studies, but still pretty surprising– one group finds spots in the genome for 20k protein-coding genes, while another can only find proteins from 12k of them.
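To put that gap in perspective, a quick back-of-the-envelope calculation in R, using only the two counts quoted above (the variable names are mine):

```r
# Counts quoted from Ezkurdia et al. as described in this post
annotated_genes <- 20110  # protein-coding genes in GENCODE v12
detected_genes <- 11838   # genes with matching proteins found by MS

undetected_genes <- annotated_genes - detected_genes
fraction_detected <- detected_genes / annotated_genes

cat("Undetected genes:", undetected_genes, "\n")
cat("Percent detected:", round(100 * fraction_detected, 1), "\n")
```

That works out to roughly 59% of the annotated genes having protein evidence, with 8,272 unaccounted for. Swapping in the alternate count of 11,840 barely moves the percentage.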

So what are my issues?

1) Isoforms. mRNAs are made by pasting together exons, and the paper spends little time describing the pitfalls of this complex subject. Isoforms are what we call two forms of the same protein-coding gene. For instance, a transcribed mRNA might have 3 exons in one version and 4 in another. Infuriatingly, these two versions may or may not produce a protein with the same function. For such a complex subject, that they only mentioned the word “isoform” 9 times is a little questionable.

2) Expression. The second example of an obvious-yet-unanswered question I used was “how many cell types are there?” Wikipedia says about 200, but we don’t have this one pinned down either. Why does this matter here? Well, it turns out that not every gene is transcribed in every cell type, so you wouldn’t necessarily find a protein translated from it. In fact, it is these transcription differences that make the cell types functionally different. Here this means that, unless you find all the proteins in all the cell types, you won’t find all the products of all the genes.

The authors acknowledge this problem several times:

“We found a strong correlation between peptide detection and the number of tissues in which a transcript was expressed.”

Essentially, the more cell types a gene is transcribed in, the more likely it was to be found by the authors. That means if a gene is transcribed and translated in only a few cell types, it’s more likely to be missed by this analysis.

“…large-scale proteomics experiments performed on specific tissues may detect gene products with restricted expression.”

“Restricted expression” here means they’re only transcribed in a few cell types. Thus, the authors predict a better recovery of the GENCODE genes if they look at more data.

“We do not detect any peptide evidence for any of the 380 diverse olfactory receptors annotated in GENCODE 12.”

This is a particular example of genes whose proteins the authors do not detect. Olfactory receptors are fairly well characterized, but they are not picked up in the MS analyses. Without digging very deeply into these analyses, I can’t say for certain, but I wonder if the reason they were not detected is because of the cell types used in analysis; maybe there were no olfactory neuron cells used.

“Conservation and gene age are the best predictors of peptide detection multiple.”

Further evidence of this insufficient-cell-type proposal is that the authors tend to recover proteins whose genes are evolutionarily old. That is, they are shared between humans and very distant relatives including fungi. These proteins tend to have more general functions, so they would be used by many cell types. “Genes are generally widely expressed and often retain important housekeeping roles.”

Finally, “Our evidence suggests that the final number of true protein coding genes in the reference genome may lie closer to 19,000 than to 20,000.”
I remain unconvinced.

This paper was made available to everyone via the arXiv. Why am I bothered that I’m seeing it now instead of when it’s published in Molecular Biology and Evolution (even if it’s allowed)? Well, for one, no one outside of the lab or friends of the lab has officially reviewed it. Normally, research papers to be published undergo thorough review by anonymized peers (peer review). Putting a draft/submitted version on the arXiv has no such guarantees.

What has happened in this case is a paper with serious flaws has gotten popular because it covers a new way to ask a question that’s obvious to anyone who knows what a gene is. The sexy title doesn’t hurt either.

Posted in Science in the News, Snarkings | Leave a comment

Tis the Friday ‘fore Solstice

Tis the Friday ‘fore Solstice,
and all through the Hub
All the students are scrounging
for road-trip-home grub.

I, as per always,
still linger in lab,
whilst my car soon drives home
to see multi-tone Labs.

It’s tough to switch off
my propensity for snark,
but the Solstice is all
about light in the dark.

As my brain peters out
for this longest of nights,
Happy Solstice to all
and, to all, a safe flight.

Posted in Uncategorized | Leave a comment