Crime Scene Report
Victim and Suspect were hiking along a remote trail in the Mojave Desert when, according to Suspect, Victim was bitten by a snake. By the time they hiked back to the trailhead and received medical attention, Victim was in critical condition. Suspect reports that the hike back to the trailhead took approximately 10 hours because of the rough terrain and Victim’s weakened condition. Suspect also reports trying to suck the venom out of Victim’s leg. (NOTE: this is neither effective nor safe. In case of a snake bite, seek medical attention and do not try to suck out the venom.) Victim died en route to the hospital.
However, we have reason to believe there may have been foul play. Suspect had a suspicious search history, including repeated searches on the effects of King Cobra venom on humans, and had suspicious communications with a local reptile store. Investigators worry that Suspect may have injected King Cobra venom into a fake snake bite wound.
Forensic scientists called to the scene swabbed Victim’s bite wound and sent the samples off for DNA sequencing.
The toxicologists have identified 4 proteins that may be of interest:
1. AMYS_HUMAN
2. HRTD_CROAT
3. ALBU_HUMAN
4. VESP_OPHHA
DNA → RNA → Protein: Biology Refresher Information
This information is also covered in the video attached to this lab assignment!
Translating DNA sequences to proteins: A set of three nucleotides is called a “codon”. Each codon corresponds to an amino acid or to STOP, a signal to stop translation. A protein is made of a string of amino acids.
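The codon-by-codon translation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table below is the standard genetic code built inline so the example is self-contained, the "*" marker for STOP is an assumption (the dictionary in codon.py may represent STOP differently), and translate_dna is a hypothetical name, not the dna2protein function from the starter code.

```python
from itertools import product

# Standard genetic code, built inline as a stand-in for the dictionary in
# codon.py (assumption: the real file may name or encode things differently).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_dna(dna, table=CODON_TABLE):
    """Translate DNA codon by codon, stopping at the first STOP ("*")."""
    dna = dna.upper()
    protein = []
    # Step through the sequence three nucleotides at a time; a trailing
    # partial codon (1 or 2 leftover bases) is ignored.
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":  # STOP: translation ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_dna("ATGGCCTAAGGG"))  # ATG -> M, GCC -> A, TAA -> STOP
```

Everything after the first STOP codon is ignored here, which matches the idea that STOP ends translation.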
Open Reading Frames: A single DNA sequence can contain the information to encode many different amino acid sequences, depending on which nucleotide translation starts from.
When searching for potential proteins encoded by a DNA sequence, we look for sequences sandwiched between START and STOP codons (with no other STOP codons in between!). These sequences are called Open Reading Frames (ORFs). STOP codons are marked in the diagram referenced below.
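The ORF search described above can be sketched as follows. The assumptions are mine, not the lab's: only the three forward reading frames are scanned (no reverse complement), ATG is taken as the START codon, the standard STOP codons TAA, TAG, and TGA are used, and each ORF is returned as a DNA string that includes its STOP codon. The name find_orfs is hypothetical; the findORFs function in the starter code may expect a different signature or return format.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna):
    """Return ORFs (START through STOP, inclusive) in the 3 forward frames."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the START codon of the ORF being read
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open a new ORF at the first START we see
            elif start is not None and codon in STOP_CODONS:
                orfs.append(dna[start:i + 3])  # close it at the first STOP
                start = None
    return orfs


print(find_orfs("CCATGAAATAGGATGCCCTGA"))
```

A START with no in-frame STOP after it is dropped, since by the definition above an ORF must be closed by a STOP codon.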
(image from: https://www.mun.ca/biology/scarr/Reading_Frames_in_mtDNA.html)
Assumptions in this lab: For the purpose of this lab, we are ignoring some really interesting layers of complexity. We are making the over-simplified assumption that DNA gets translated directly into functional proteins. This ignores some really cool processes, including post-transcriptional mRNA splicing/capping/polyadenylation and post-translational protein modifications. Here’s a fun Crash Course video, if you’re interested in exploring the biology of this further: https://www.youtube.com/watch?v=itsb2SqR-R0.
Given Information
Modules you’re allowed (but not required) to import: os, numpy, and the codon table I have provided in codon.py. If you want to import any other module, please pre-approve it with me first. Thank you!
.txt files: You will receive two .txt files, each in FASTA format. A FASTA file is a text file in which a description of each sequence is written on one line in the format “>Description” and the sequence itself is written on the following lines. The start of a new sequence is indicated by a new “>Description” line.
The first .txt file contains the DNA sequences from the “Lab Results”.
The second .txt file contains amino acid sequences for “Proteins of Interest”.
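The FASTA layout described above can be read with a short loop. The sketch below is one possible approach, not the required read_FASTA implementation: parse_fasta is a hypothetical name, and returning a dictionary that maps each description to its joined sequence is an assumption about what is convenient, not something the starter code specifies.

```python
def parse_fasta(lines):
    """Map each ">Description" to the sequence on the lines that follow it."""
    records = {}
    description = None
    for raw_line in lines:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new ">Description" line starts a new sequence.
            description = line[1:].strip()
            records[description] = []
        elif description is not None:
            # Sequence lines are collected until the next description.
            records[description].append(line)
    return {desc: "".join(parts) for desc, parts in records.items()}


example = [">seq1 sample\n", "ATGC\n", "GGTA\n", "\n", ">seq2\n", "TTAA\n"]
print(parse_fasta(example))
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example inside a `with open(path) as handle:` block, where `path` is whichever .txt file you were given.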
.py files:
You will receive a codon.py file containing a dictionary whose keys are codons and values are the associated amino acid.
You will receive a lastname_lab3.py file containing starter code for this lab. The starter code contains some helpful hints and provides a general structure for your work.
You will receive a webb_lab3_test.py file containing test cases for your read_FASTA, dna2protein, and findORFs functions. You will also receive a webbtest.py module that is needed to run the webb_lab3_test.py test cases.
Your Task
1. Use the website www.uniprot.org to fill out the chart below. Your answers should be about 2 sentences per protein.
Protein | Relevance to the case |
AMYS_HUMAN | This enzyme initiates starch digestion in the human oral cavity, so human saliva contains it. If Suspect really tried to suck venom out of Victim’s leg, this enzyme should be detected at the bite wound. |
HRTD_CROAT | This venom protein comes from the Western diamondback rattlesnake, not the King Cobra. If it is detected at the bite wound, the bite should be real, because it means Victim was truly bitten by a Western diamondback rattlesnake. |
ALBU_HUMAN | This protein is found in human blood. Detecting it would suggest that Victim bled at the wound, but because both a real snake bite and a venom injection would cause bleeding, it cannot distinguish between the two. |
VESP_OPHHA | This toxin is from the king cobra rather than from Western diamondback rattlesnake. So, if this is de |