What went wrong with the Ketchum Bigfoot DNA study? Haskell Hart has the answers. The chemist has analyzed Ketchum’s data extensively and published two scientific papers on the subject of purported Bigfoot DNA. In this interview Hart explains that Ketchum and her co-authors did not correctly analyze and interpret the data. But he doesn’t rule out that there really was Sasquatch DNA collected for the study and that the creature could be real.
Haskell, can you tell me about yourself? You are a chemist, right? Why do you look at DNA and the Ketchum paper?
Haskell Hart: I started looking into the Ketchum paper because it was scientifically interesting. I am a lifelong amateur naturalist. My PhD is in physical chemistry, but I also worked alongside biochemistry students and postdocs in the W. N. Lipscomb Group at Harvard. I’ve known about DNA and its structure since high school or early college – over 50 years by now.
How did you become involved with Bigfoot research and Melba Ketchum? Did you have a connection to Ketchum prior to the work for your papers?
As a naturalist and chemist I found her paper very interesting. No connection to her prior to my work with her published sequences. We communicated a few times on my results in 2013 and 2014.
After the release of the Ketchum paper Meldrum told me that it is very unusual to not publish the whole raw data together with the article. Did you get access to this raw data for your analysis?
The raw data would be virtually uninterpretable to most readers, but could have been included in supplementary files. You need special software to produce sequences from it. Subsequently some electropherograms have been posted on Ketchum’s website. She did publish her sequences in alphabet form, which are what I analyzed, but these are results, not raw data.
Why are your results so different from Ketchum’s? What were the main mistakes in her analysis?
For two of the three nuDNA sequences (samples 26 and 140): she did not correctly use BLAST and did not correctly interpret her search results. She did not search all the correct databases in GenBank. Her phylotrees for these two are ludicrous: fish, chicken, mice. She never even checked the scientific names or translated them to common names. She had a preconceived notion about her results, evidenced by constant reference to “novel hominin” in her paper.
Sample 26 was the so-called Bigfoot steak from the Sierra Nevada that was found by Justin Smeja. Sample 140 is blood found by Stan Courtney on the inside of a damaged downspout in Illinois.
So Ketchum had nine coauthors. Isn’t it strange that none of them found the mistakes?
Yes, very. They abdicated their responsibilities as coauthors, taking the word of the misguided Fan Zhang, with degrees in mechanical and aeronautical engineering. The microscopist from Texas A&M University told me he does not speculate in the paranormal, rather only takes pictures. He would not even speculate to me on the origin of the single stranded DNA when I asked him. Very unacceptable and unscientific as far as I am concerned.
Now even if they did realize their mistakes they would probably not own up due to loss of reputation and business. I sent them all an early version of my paper but got no responses.
Am I right in my assessment that any scientist should come to the same conclusions as you? Your analysis is repeatable?
Absolutely, and they do not need to be a geneticist or even a degreed scientist. I devoted three blogs to explaining my methodology. So far nobody has admitted to checking out my searches and results.
Apparently some of the samples match modern humans. For example, sample 31, which was collected by the Erickson Project in Alabama. But you can’t rule out that this sample does come from a sasquatch, right?
Right. If sasquatch is near human a complete nuDNA genome may be required to distinguish it from a modern human. If in fact it is modern human, you may never be able to make the distinction.
The sample 31 nuDNA sequence is only 0.5 M bp long. The whole human genome is approximately 3.3 B bp long, so Ketchum only sequenced about 0.016% of the human genome on only one chromosome (11), in spite of her title claim of “Three Whole Genomes”. She says she has the remaining sequences for these three samples but has never disclosed them or even said what they match or resemble.
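The proportion is easy to check. The short sketch below uses only the rounded figures quoted above (0.5 M bp and 3.3 B bp); the variable names are illustrative. With those rounded inputs the result comes out at roughly 0.015%, in line with the figure of about 0.016% given above, which presumably rests on the unrounded sequence length.

```python
# Rounded figures quoted in the interview (not exact sequence lengths).
sequenced_bp = 0.5e6   # sample 31 nuDNA sequence, about 0.5 M bp
genome_bp = 3.3e9      # whole human genome, about 3.3 B bp

# Share of the genome covered by the published sequence, in percent.
percent = sequenced_bp / genome_bp * 100

print(f"{percent:.3f}% of the genome")
```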
There’s the possibility that some samples in the Ketchum study were in fact sasquatch but were not properly tested. Is this assessment right? So would it be possible to do new testing on the samples? Which samples would you like to run new tests on?
Anything is possible. All the tests she performed are destructive, so there may not be additional sample available to test, especially if the original was only a hair or two. In fact I identified the human-like samples in Table 1 of my paper in the Relict Hominoid Inquiry.
In my third, as yet unpublished paper for the Journal of Cryptozoology, I suggest more samples to study, or obtaining new samples from the same regions.
Haskell, I’ve also corresponded with Ketchum. She doesn’t accept your results. I’d like to quote her and ask you for a response.
The first of her claims is this: “He didn’t use bioinformatics to analyze it like the bioinformaticist at the University of Texas. All mammals share certain sequences and when randomly searched through Genbank, you will come up with other species. You have to assemble and analyze the entire sequence using particular software to get the overall picture. He doesn’t have this software, supercomputer, or the expertise to use it.”
She has no idea what software and computer I have, nor what expertise I possess.
The reason that her sequences are close matches for different mammals is by her design: using a human chromosome 11 reference sequence to assemble nonhuman samples. So, naturally only those parts of the bear and dog genomes which are conserved with human were sequenced, 1-2% of chromosome 11, which has 135 M bp. The sequences for samples 26, 31, and 140 have been assembled by her, and there is no need to redo that, since she published the resulting sequences, which I took at face value. Since they are tiny fractions of the entire genomes, if she has “the entire sequence” for these samples, she should publish and interpret these. Constant reference to unpublished data is not credible.
These statements are smoke and mirrors on her part. In May of 2015 she intimated on Facebook that she was having the raw data reassembled. Nothing further has been reported on this. My work does not involve assembly at all. I compared her assembled sequences (strings of A, T, G, C) with those of known species through freely accessible BLAST software (the same that she used) on the National Institutes of Health’s servers, also available free online.
One does not need a supercomputer, additional software, or any special expertise for this. It’s quite simple, as I outlined in my papers and blog. It’s very unlikely that a different assembly would produce a different species result. If the samples 26 and 140 were human-like, they would have sequenced as such with the human reference sequence that she used. My 17 best hits (by score and sequence length) were all bears for sample 26 (see my Journal of Cryptozoology paper). My Figures 1, 2, 3 and 4 in the Relict Hominoid Inquiry paper show that sample 26 is a bear and sample 140 a dog. Ketchum presents no comparable results to prove her conclusions.
Ketchum’s claim #2: “When I first got the raw data, I did the same thing and was mortified to find all kinds of mammals in the genomes. The bioinformaticist had to explain this to me since I hadn’t dealt with whole genomes before, being a forensic scientist.”
The presence of hits for different animals in the results does not mean that they are all present in your sample genome, nor does it mean that you have discovered a new species. You need to sort through the hits, preferably in the downloaded Excel file, to find the best hit(s). I did this, and bears consistently beat all primates for sample 26, as did dog for sample 140.
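Sorting the hits can be sketched in a few lines. This is only an illustration of the idea, not Hart’s actual workflow: the `Hit` fields, the `best_hits` helper and every number below are invented placeholders, and the species labels are deliberately generic. Ranking by score and then by alignment length follows the criterion Hart names for his best hits.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One row of a search-result table (hypothetical columns)."""
    species: str
    bit_score: float
    align_len: int
    pct_identity: float


def best_hits(hits, top=3):
    """Rank hits by bit score, breaking ties by alignment length."""
    ranked = sorted(hits, key=lambda h: (h.bit_score, h.align_len), reverse=True)
    return ranked[:top]


# Invented placeholder rows, NOT real search results.
hits = [
    Hit("species_A", 1500.0, 1200, 94.0),
    Hit("species_B", 2900.0, 1800, 98.5),
    Hit("species_C", 2900.0, 1650, 98.7),
    Hit("species_D", 310.0, 180, 99.4),  # very high identity, but a short match
]

for h in best_hits(hits):
    print(h.species, h.bit_score, h.align_len, h.pct_identity)
```

Note that the placeholder `species_D` has the highest percent identity but drops out of the top three because its match is short and low-scoring, which is why a long list of mixed species in the raw results says little until it is ranked.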
Ketchum’s claim #3: “I’ll try to explain to you. Say you have a sequence that is basically a few bases more similar to polar bear than human. A simple BLAST (search) like he did will show this at a high level, perhaps 99% for a short sequence, but within that sequence, there are a few nucleotides that don’t align with polar bear, so actually it is unknown, same with human. If it were bear, it would have been 100% and same with human. They are neither but an unknown hominin. Also, bears share about 60% of their DNA with humans, which means they are closely related and the DNA sequences will be similar, but not exact, just like the BLAST shows.”
All bears of a single species do not match 100%. Nor do all humans. All species have mutations which make the individuals slightly different from one another. Bears and humans are sufficiently different that this would show up in a sequence of any significant length, certainly one of hundreds or thousands of base pairs. (Sykes used only 104 bp to distinguish all the species in his paper). Only 60% similar is very distant on the evolutionary scale. Todd Disotell says we are 39% similar to a banana. Further, polar bears hybridize with brown bears in some geographical regions, so you could have 1-2% variation, as shown by Sykes and especially in the paper that addressed his bear results.
When I accessed black bear data in the literature which was not available in GenBank, I got black bear matches, slightly better than polar bear or panda and much better match than any primates, considering that she used a human reference sequence to sequence a bear sample. Ursus bears have diverged relatively recently, so they are genetically similar.
In a private communication she shared results of her experts with me, including the findings of one who broke up the DNA sequences into fragments averaging 60 bp before analyzing them with BLAST. When he combined his results he got basically the same results as I did, namely only a 94-95% match to human for sample 26 and sample 140. Breaking sequences into fragments is definitely not the way to go because you lose information, namely how they are connected. (For example: fragments A, B, C can be combined six different ways to give a whole sequence, but only one is correct). Further, as she mentions, for a “short sequence” you may get very good matches to other species. This was her expert’s approach, not mine. I searched her full published nuDNA sequences and got hits that were 1000-2000 bp long for sample 26, plenty long enough to make species calls, even though she biased the results toward human by using a human reference sequence in her assembly.
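The count of six orderings for three fragments is simply 3! = 3 × 2 × 1. A minimal sketch, using the placeholder fragment names A, B and C from the example above:

```python
from itertools import permutations

fragments = ["A", "B", "C"]

# Every possible order in which the three fragments could be joined.
orderings = ["".join(p) for p in permutations(fragments)]

print(len(orderings))
print(orderings)
```

The number of orderings grows factorially with the number of fragments, so a sequence cut into thousands of 60 bp pieces carries no usable record of how the pieces were connected.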
Ketchum’s claim #4: “Query sub-file files for each genomic contig (26, 31 and 140) were created via a Perl script I wrote. The procedure removed the original fasta header line and any empty lines. The script then created substrings demarcated at newlines along with a new fasta header line for each substring. New substrings were on average 60 bases in length.”
This guy got overall match to human: 94.16% for sample 26, 97.20% for sample 31, and 94.69% for sample 140. They should have noticed this difference. The results for sample 26 and sample 140 closely match mine and are slightly lower for sample 31. 94% is not close enough for a species match or even for two “hominins”, especially when the reference sequence for sequencing the unknown was human.
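For readers who want to see what the fragmenting step quoted in claim #4 amounts to, here is a rough Python sketch. It is not the original Perl script, which has not been published; the function name and the header naming scheme are assumptions, and it assumes the input has one sequence line per text line.

```python
def split_fasta_by_line(fasta_text, prefix="fragment"):
    """Drop header and empty lines, then emit each remaining sequence
    line as its own FASTA record with a newly generated header."""
    records = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and the original header
        records.append(f">{prefix}_{len(records) + 1}\n{line}")
    return "\n".join(records)


# Tiny made-up input, only to show the shape of the output.
example = ">contig_demo\nACGTACGT\n\nTTGACCA\n"
print(split_fasta_by_line(example))
```

Each output record stands alone, so a search run on these records sees short, unconnected pieces rather than the assembled sequence.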
Ketchum’s claim #5: “He shows his lack of understanding by attempting to publish this. Obviously, these are not walrus samples, yet it shows more walrus than human, same with dog. The reason for all of this similarity is the region assembled was highly conserved in mammals, meaning that all the sequences are similar in mammals. There are no polar bears in California either. It should have come back black bear if it was bear. Not only that, but if it were a bear, the mitochondrial sequence would have species ID’ed as black bear, not human. The same with the other two samples. If they were anything other than an unknown hominin, the species ID, the mitochondrial portion would have shown those species. Law enforcement uses the mitochondrial sequences to prove the species and if the species were anything other than human/hominin, it would have shown it.”
“Attempting”? No, I actually did publish my work in two peer reviewed journals. My peer reviewers thought I understood the problem. Hers did not think she did, as her paper was rejected twice.
Reference to a walrus is likely due to the fact that I found that walrus matched her sample 26 better than human, but not as well as bear. If you look at the taxonomic tree of life, you will see that walrus is actually quite near to bear (both are carnivores), much closer than either is to human.
I’m much more aware of the effect of conserved genes on her results than she was, as explained in my papers and blog. I know there are no polar bears in California. However, there is relatively little black bear data in the database that she searched, while there are entire polar bear and entire panda genomes in other GenBank databases, which she did not search, according to her paper. Hence other bears will match fairly closely to a black bear sample, much better than a primate will.
My phylotrees match established taxonomy. Hers are nonsense – chicken, mice, bony fish, sharks, and nothing even close to a primate.
Ketchum’s claim #6: “On sample 26, our lab extracted the sample, as did another accredited forensic lab, Both samples were sent for testing and both samples tested human at Family Tree DNA and all the other labs and the labs had no idea what they were testing.”
I have to say here that FTDNA missed the boat by not flagging her samples that have too many mutations to be purely human. Regardless of the haplogroup you should not have more than 4, at most 6 (less than 1% probability of occurring in the natural human population) extra mutations. They should know this. Her sample 26 had 16 extra mutations, so many that it could not be uniquely haplogrouped. FTDNA should have noticed this. They obtained human results because they are a lab that does human DNA testing, and their protocol is specific for human, so they found the human contamination only.
Three other forensic labs obtained black bear for sample 26, two of them with human contamination in much lower concentration. Those two used both black bear and human primers. The third, Sykes’ lab, washed away the human contamination and used universal primers.
Ketchum’s claim #7: “I never touched the second extraction by the other lab. On the nuclear side, human and unknown. Once again, the nuclear was for specific genes and they didn’t show bear or dog or anything else. This is not our data, but data from accredited labs.”
She didn’t use bear or dog primers, rather human. The results were seen as anomalous, which she calls “unknown”, because the primers did not match the main species, not because she found something new. What human she did find was very likely contamination or conserved, since it was not consistently human. You should not get “human and unknown” in a sample with a single species. Even a human hybrid would have tested more human-like than many of her samples did.
Ketchum’s claim #8: “All I did was compile it and help write the paper.”
This was her biggest mistake. All authors bear responsibility for the paper, especially the principal investigator, her. She did not understand the results of the others and simply accepted them because they supported her foregone conclusions. I didn’t understand all of her paper on first reading, but I educated myself by reading appropriate works of others so that I understood her paper before I published. This took many readings of the paper and many database searches (hundreds).
Ketchum’s claim #9: “You also have to take into account the entirety of the testing, not just one sequence. Due to the unknown sequences, the bioinformaticist was only able to compile certain lengths of sequence without hitting an area that aligned with nothing in Genbank. Don’t take my word for any of this since he wrote the genomics part of the paper. It’s not my words, but the expert in the field. Finally, you have to look at ALL the testing including the mtDNA, not just the one conserved sequence that was assembled by the University of TX. I hope you can clearly understand this.”
The whole is the sum of the parts. If any part does not support the conclusion, then the whole conclusion needs to be reexamined. Further, anything vaguely resembling a primate (which she claims) should match some primates, though not 100%. It goes against evolution to say that a species shares nothing in common with any species on earth. She made this claim for dogman on the recent LNM radio interview.
I have looked at ALL her published results, which are summarized in my Table 1 of the Relict Hominoid Inquiry paper, and which I did specifically for the purpose she mentions – considering all the data. Her expert has no degrees in anything close to genetics or bioinformatics (a fancy name for a computer jockey). My results and conclusions have been accepted by real experts in genetics and related fields. Her results and conclusions were rejected by two journals, for many of the same reasons that I have criticized them.
As mentioned previously the “University of Texas” is really the “University of North Texas,” and her “expert” is coauthor Fan Zhang.
Nobody has come forward and pointed out the mistakes in my work, in specific details. Her general opinions and comments don’t count for much in science. It will be very interesting to see who comes forth to support her and to disprove my conclusions. She can comment on my RHI paper in the RHI online journal.
Haskell, what’s your take on the Bigfoot phenomenon in general? What are the most likely explanations for the sasquatch in your view?
I draw no conclusions so far. So many personal accounts, but so little scientific evidence. I hold out for a documented body or body part.
I’ve heard numerous times from skeptics that Bigfoot DNA should have been found and properly analyzed by now. Do you agree? What are the obstacles?
It may have been, if it is very human-like. Numerous samples tested human. There are no scientific obstacles to testing the DNA, but samples so far, except no. 37, have no documentation of the source. Somebody needs to take a picture or video of the source of their DNA sample, or bring a whole body in.
For more information, visit Haskell’s website bigfootclaims.blogspot.com or read his two papers:
– Hart, Haskell, Not Finding Bigfoot in DNA, in: Journal of Cryptozoology (volume 4, December 2016), URL: http://www.journalofcryptozoology.com/
– Hart, Haskell, DNA as evidence for the existence of relict hominoids, Relict Hominoid Inquiry, 2016, URL: http://www2.isu.edu/rhi/pdf/HART-DNA-Evidence.pdf
The paper by Melba Ketchum can be read here.