US-20260125710-A1 - EUKARYOTE QUADRUPLET EXPANDED DNA (QED) GENETIC CODE

Abstract

The Quadruplet Expanded DNA (QED) eukaryote genetic code comprises twenty nondegenerate QED codons that encode proteins (the protein-encoding codons) and thirty-five nondegenerate QED codons (the noncoding codons) that are highly correlated with cis-regulatory elements and that control and regulate transcription, alternative splicing, and polymerization in eukaryotic protein synthesis using canonical amino acids. The QED eukaryote genetic code is an advancement in gene therapeutics that allows for the correction of dysfunctional proteins. Additionally, the QED eukaryote genetic code is applicable to changing paradigms for identifying cures for monogenic rare diseases, multigene cancers, and neurodegenerative diseases.

Inventors

Rama Shankar Singh

Assignees

Rama Shankar Singh

Dates

Publication Date: 20260507
Application Date: 20241102

Claims (14)

1. A genetic code applicable to eukaryotic cells, prokaryotic cells and viruses, the genetic code comprising: a Quadruplet Expanded DNA (QED) genetic code having a quadruplet codon structure, with each codon of the quadruplet codon structure including four consecutive DNA bases of A, T, C, and G, to thereby expand the genetic code from a triplet codon structure to a quadruplet codon structure; a first set of twenty (20) independent protein-encoding QED codons, with each protein-encoding QED codon within the first set including canonical amino acids; and a second set of thirty-five (35) independent noncoding QED codons, with each noncoding QED codon within the second set being utilized for a cellular regulatory mechanism, and with each noncoding QED codon within the second set being used to regulate pre-mRNA splicing of a plurality of mRNA in order to obtain a plurality of exons.
2. The genetic code of claim 1, wherein an order and arrangement of bases for the first set of protein-encoding QED codons and the second set of noncoding QED codons are position independent.
3. The genetic code of claim 1, wherein an order and arrangement of bases for the first set of protein-encoding QED codons and the second set of thirty-five noncoding QED codons are symmetrical.
4. The genetic code of claim 1, wherein the second set of noncoding QED codons initiate a first transcription process.
5. The genetic code of claim 1, wherein the cellular regulatory mechanism utilized by the second set of noncoding QED codons is used to identify exon-intron interfaces.
6. The genetic code of claim 1, wherein the second set of noncoding QED codons initiate the spliceosome process.
7-10. (canceled)
11. The genetic code of claim 1, wherein the QED genetic code further including: a codon-anticodon pairing, with an anticodon of a QED codon from the first set of protein-encoding QED codons acting as an encoding QED codon for a first canonical amino acid sequence, with the amino acid being a canonical amino acid; and the number of hydrogen bonds are maintained between the anticodon of the QED codon encoding the first canonical amino acid sequence and the encoding QED codon of the second canonical amino acid sequence.
12. The genetic code of claim 11, wherein the codon-anticodon pairing reduces a number of tRNA molecules required for protein synthesis.
13. The genetic code of claim 1, wherein the QED genetic code is used to transfer a first portion of a dysfunctional protein to a functional protein.
14. The genetic code of claim 13, wherein the transfer of the first portion of the dysfunctional protein to the functional protein is performed by a first set of reverse QED codons correcting an amino acid sequence to obtain a corrected mRNA sequence.
15. The genetic code of claim 14, wherein a second set of reverse QED codons are used to perform a reverse transcription operation to the dysfunctional protein to obtain a corrected protein.
16. The genetic code of claim 1, wherein the QED genetic code is used to transfer a first portion of a dysfunctional protein to a complementary DNA (cDNA) sequence.
17. The genetic code of claim 16, wherein the transfer of the first portion of the dysfunctional protein to the cDNA sequence is performed by the QED codon translating mRNA to obtain a corrected protein.

Description

CROSS REFERENCE INFORMATION

This application claims the benefit of U.S. Provisional Application No. 63/536,566, filed on Sep. 5, 2023, which is incorporated herein by reference in its entirety.

BACKGROUND

The present invention generally relates to the field of genetics, and more specifically to utilizing a novel genetic code for various gene therapy applications and for treating other medical conditions.

Genetic Coding History Since 1962

Pre-1970 Genetic Code Limited to Prokaryotes and Viruses

The pre-1970 triplet genetic code was proposed once the structure of DNA (References 1, 2) had been established by Francis Harry Compton Crick, James Dewey Watson, and Maurice Hugh Frederick Wilkins, who were awarded the 1962 Nobel Prize in Physiology or Medicine (Reference 3). DNA has four bases, T, A, C, and G, which naturally form complementary pairs known as Watson-Crick (WC) pairs: T:A with two hydrogen bonds and C:G with three. Furthermore, Crick introduced the concept of the central dogma of biology, in which DNA is the hereditary material and protein synthesis proceeds from DNA to mRNA, which the triplet code translates into protein (References 4, 5, 6).

The triplet combinations of the four DNA bases yield 64 possible codons, which were verified by Robert W. Holley, Har Gobind Khorana, and Marshall W. Nirenberg, who were awarded the 1968 Nobel Prize in Physiology or Medicine (Reference 7): 61 triplet codons encode the twenty amino acids, one of which also serves as the START signal, and 3 codons are STOP signals. These authors used different, complementary techniques to verify the codons: Khorana used a synthesis process (Reference 8); Nirenberg used an enzymatic binding process (References 9, 10); and Holley used the structure of tRNA (Reference 11) with attached amino acids and anticodons. At the ribosome, tRNA anticodons form WC pairs with mRNA triplet bases, resulting in a protein.
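As an illustration of the counts given above, the following minimal Python sketch enumerates all triplet codons over the four DNA bases and applies the WC pairing rule with the hydrogen-bond counts stated in the text (two for T:A, three for C:G). The function names `all_codons`, `wc_complement`, and `hydrogen_bonds` are hypothetical helpers chosen for this sketch and are not part of the disclosure.

```python
from itertools import product

BASES = "TACG"

# Watson-Crick partner of each base, and the number of hydrogen bonds
# in the pair that base forms (T:A = 2, C:G = 3), as stated in the text.
WC_PARTNER = {"T": "A", "A": "T", "C": "G", "G": "C"}
H_BONDS = {"T": 2, "A": 2, "C": 3, "G": 3}


def all_codons(length):
    """Return every ordered sequence of `length` bases (4**length strings)."""
    return ["".join(p) for p in product(BASES, repeat=length)]


def wc_complement(seq):
    """Return the base-by-base WC partner of `seq` (no strand reversal)."""
    return "".join(WC_PARTNER[b] for b in seq)


def hydrogen_bonds(seq):
    """Total hydrogen bonds formed when `seq` is fully WC-paired."""
    return sum(H_BONDS[b] for b in seq)


triplets = all_codons(3)
print(len(triplets))                                 # 64
print(wc_complement("ATG"), hydrogen_bonds("ATG"))   # TAC 7
```

The same `all_codons` helper accepts any codon length, so it can be reused for longer codon structures without modification.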
When a perfect WC pair did not occur, the wobble hypothesis was introduced to accommodate it.

The triplet code is not an optimal code. Originally, Crick posed it as a coding problem in which four DNA bases, T, A, C, and G, encode 20 amino acids. According to Shannon's information theory, the optimal number of bits required to encode N objects is log2 N. Thus, for N = 20 amino acids, the optimal number of bits is log2 20 = 4.32. However, the triplet code has 64 codons, requiring log2 64 = 6 bits. Therefore, it is nonoptimal and degenerate: multiple codons encode the same amino acid. Additionally, there are twenty amino acids but not twenty tRNAs, so iso-tRNA was proposed to decode multiple amino acids.

The triplet code was considered universal under the central dogma of biology (DNA to mRNA to protein). However, viruses violate this rule, because viral mRNA, rather than DNA, is the starting point. Protein production proceeds from viral mRNA to complementary DNA (cDNA) by reverse transcriptase, then to mRNA, and then to protein. The most critical limitation of the central dogma of biology is the one gene-one protein hypothesis, which is valid for prokaryotes but ultimately fails for eukaryotes, where one gene can yield multiple proteins.

The triplet code has no gene control mechanism. François Jacob and Jacques Monod developed (Reference 12) a gene regulatory mechanism in prokaryotes based on the synthesis of a cluster of enzymes, called an operon, to control mRNA genes. Operons are under either negative or positive control, and these two modes are not mutually exclusive.
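The bit counts in the Shannon comparison above can be reproduced with a short Python sketch. It assumes only the formula given in the text (log2 N bits to encode N objects) and two bits per base for a four-letter alphabet; the function names `min_bits` and `bits_per_codon` are hypothetical and introduced here for illustration.

```python
import math


def min_bits(n_objects):
    """Optimal number of bits needed to encode n_objects, i.e. log2(N)."""
    return math.log2(n_objects)


def bits_per_codon(length, alphabet_size=4):
    """Bits carried by a codon of `length` symbols from the given alphabet."""
    return length * math.log2(alphabet_size)


print(round(min_bits(20), 2))     # 4.32 bits for twenty amino acids
print(4 ** 3, bits_per_codon(3))  # 64 6.0 for the triplet code
```

The difference between the two results, about 1.68 bits per codon, is the excess capacity that the text describes as degeneracy of the triplet code.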
The 1965 Nobel Prize in Physiology or Medicine (Reference 13) was awarded to François Jacob, André Lwoff, and Jacques Monod “for their discoveries concerning genetic control of enzyme and virus synthesis.”

Post-1970 DNA Code Required for Eukaryotes

Post-1970 research in molecular and cellular biology and genetics showed that eukaryotes require transcription, splicing, and various regulatory and control processes, including epigenetics, in the cell. In about 1977 (References 14, 15), it was shown that less than 2% of DNA bases encode proteins and that the remaining bases are noncoding and regulate the protein synthesis process. Additionally, genes were found not to be continuous. They resemble beads on a string: coding portions (exons) are separated by noncoding portions (introns), and a splicing process is required to separate them. Richard J. Roberts and Phillip A. Sharp demonstrated the existence of “split genes” and were awarded the 1993 Nobel Prize in Physiology or Medicine (Reference 16). Multiple proteins were shown to be synthesized from a single gene (References 17, 18) by alternative splicing, thus breaking the one gene-one protein hypothesis of the central dogma of biology. In eukaryotes, transcription yields pre-mRNA, followed by splicing, which generates mRNA for protein synthesis. Roger Kornberg elucidated the detailed transcription process in eukaryotes using baker's yeast as a eukaryotic model and X-ray structural analysis. He showed that the