1
00:00:15 --> 00:04:43
OK.  Parts of a gene.

2
00:04:43 --> 00:04:47
We have our promoter,
which is part of the untranscribed

3
00:04:47 --> 00:04:51
region of a gene,
usually at the 5 prime end.

4
00:04:51 --> 00:04:55
Not always but for the genes we're
talking about at the 5 prime end,

5
00:04:55 --> 00:04:59
the so-called 5 prime end of the
gene, or so-called upstream of this

6
00:04:59 --> 00:05:04
transcribed region.
And downstream of that there is more

7
00:05:04 --> 00:05:09
untranscribed region that
interestingly can also contribute to

8
00:05:09 --> 00:05:13
the promoter, even though it's far
away from this more upstream part of

9
00:05:13 --> 00:05:18
the promoter.  But I'm going to call
it just for now untranscribed,

10
00:05:18 --> 00:05:23
two flanking regions of
untranscribed DNA sequence and one

11
00:05:23 --> 00:05:29
region of transcribed sequence.
Now, I want to discuss with you very

12
00:05:29 --> 00:05:35
briefly a phenomenon called splicing.
And this is a phenomenon that

13
00:05:35 --> 00:05:41
occurs within the RNA that is
transcribed from a gene and,

14
00:05:41 --> 00:05:47
therefore, pertains to the
transcribed region of the gene.

15
00:05:47 --> 00:05:53
It turns out that in this
transcribed region there are two

16
00:05:53 --> 00:05:59
kinds of sequences.
There are things called exons and

17
00:05:59 --> 00:06:05
there are regions called introns.
The exons code for something,

18
00:06:05 --> 00:06:11
code for the final function of the
RNA or for eventually a protein.

19
00:06:11 --> 00:06:17
So these are coding.  The introns
are noncoding.

20
00:06:17 --> 00:06:24
Both of them are transcribed.
You'll see this definition is a

21
00:06:24 --> 00:06:30
little loose as we move on in
today's lecture, but

22
00:06:30 --> 00:06:36
it's good enough.
In the transcript that initially is

23
00:06:36 --> 00:06:42
made from the gene in this
transcribed region,

24
00:06:42 --> 00:06:48
both introns and exons are present.
So these are present in what's

25
00:06:48 --> 00:06:54
called the primary transcript or
primary RNA.  And primary refers to

26
00:06:54 --> 00:07:00
the first RNA that is transcribed
from the gene.

27
00:07:00 --> 00:07:07
And subsequent to that,
still in the nucleus, those introns

28
00:07:07 --> 00:07:14
and exons are subject to a process
called splicing whereby the introns

29
00:07:14 --> 00:07:24
are removed --

30
00:07:24 --> 00:07:32
-- or spliced out is the term,
such that in your mature RNA only

31
00:07:32 --> 00:07:49
the exons are present.

32
00:07:49 --> 00:07:53
This process is likely a consequence.
I'm going to put up a diagram that

33
00:07:53 --> 00:07:57
you had on your handout from last time.
You can watch it now.  And you can

34
00:07:57 --> 00:08:02
refer back to a previous lecture if
you don't have it with you.

35
00:08:02 --> 00:08:07
This notion of introns and exons is
probably a consequence of evolution

36
00:08:07 --> 00:08:12
whereby different parts of genes
were combined and shuffled to give

37
00:08:12 --> 00:08:17
new kinds of genes and,
therefore, new kinds of proteins.

38
00:08:17 --> 00:08:22
Here on my diagram I have exons in
black and introns in blue,

39
00:08:22 --> 00:08:27
and they're all just DNA sequence,
but when the RNA is transcribed in

40
00:08:27 --> 00:08:32
the first place it is primary RNA.
It's a copy of the gene.

41
00:08:32 --> 00:08:36
It has both exons and introns.
And then a very complex enzymatic

42
00:08:36 --> 00:08:41
machinery comes on and it loops out
and excises these introns.

43
00:08:41 --> 00:08:46
OK?  So this is very interesting
such that in your mature mRNA there

44
00:08:46 --> 00:08:50
are no introns.
And the introns have been looped

45
00:08:50 --> 00:08:55
out and they form these little
structures that are called lariats.

46
00:08:55 --> 00:09:00
And at this point your mRNA is
mature --

47
00:09:00 --> 00:09:04
-- and it moves to the cytoplasm.
Now, this process was discovered

48
00:09:04 --> 00:09:09
by Professor Phillip Sharp here at
MIT and he got the Nobel Prize for

49
00:09:09 --> 00:09:13
it in 1993.  It's a very important
process because it's absolutely

50
00:09:13 --> 00:09:18
required for maturation of RNAs.
And also, and I'll come to this in

51
00:09:18 --> 00:09:23
a few lectures' time,
it allows different proteins to be

52
00:09:23 --> 00:09:28
made from the same gene.
So here's a rule.

53
00:09:28 --> 00:09:33
In this RNA there are what are
called splice donor sites that I've

54
00:09:33 --> 00:09:38
put as a circle and splice acceptor
sites that I've put as a square.

55
00:09:38 --> 00:09:43
Just watch this.  Just watch this
for now because we will come back to

56
00:09:43 --> 00:09:48
it.  So watch what I'm saying rather
than trying to madly write down.

57
00:09:48 --> 00:09:53
Any splice donor can join to any
splice acceptor and remove the stuff

58
00:09:53 --> 00:09:58
between them.  So in this top
example I've got each intron being

59
00:09:58 --> 00:10:04
neatly removed because splice donors
and acceptors interact.

60
00:10:04 --> 00:10:08
But look at the example below.
I've got this splice donor next to

61
00:10:08 --> 00:10:12
exon one interacting with a splice
acceptor next to exon three.

62
00:10:12 --> 00:10:16
And when that happens you remove
the whole of exon two.

63
00:10:16 --> 00:10:20
So you actually are going to make a
different protein.

64
00:10:20 --> 00:10:24
Whereas, in the first case you'll
have exons one,

65
00:10:24 --> 00:10:28
two and three and four.
In the second case you'll have

66
00:10:28 --> 00:10:32
exons one, three and four.
OK?  So this process is very

67
00:10:32 --> 00:10:38
important for allowing different
kinds of proteins to be made from

68
00:10:38 --> 00:10:43
the same gene.
I want to make you aware of this

69
00:10:43 --> 00:10:49
now, and I will come back to it in
the formation module when we talk

70
00:10:49 --> 00:10:54
about how different kinds of cells
are generated.
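
As a rough sketch of the donor and acceptor rule just described, the Python below models a primary transcript as an ordered list of exons and introns and joins chosen splice donors to splice acceptors. The segment names (E1, I1 and so on) and the function name splice are illustrative assumptions, not anything from the handout, and real splice-site choice is far more tightly regulated than this.

```python
# Hypothetical primary transcript, 5' to 3': exons (E) and introns (I).
PRIMARY_TRANSCRIPT = ["E1", "I1", "E2", "I2", "E3", "I3", "E4"]


def splice(transcript, junctions):
    """Join splice donors to splice acceptors and return the mature exon order.

    Each junction is (donor_exon, acceptor_exon): the donor site just after
    donor_exon is joined to the acceptor site just before acceptor_exon, and
    everything between them is removed.
    """
    exons = [seg for seg in transcript if seg.startswith("E")]
    mature = [exons[0]]
    for donor, acceptor in junctions:
        if donor != mature[-1]:
            raise ValueError(f"donor {donor} does not follow {mature[-1]}")
        if exons.index(acceptor) <= exons.index(donor):
            raise ValueError("acceptor must lie downstream of its donor")
        mature.append(acceptor)
    return mature


# Every intron removed neatly: exons one, two, three and four.
constitutive = splice(PRIMARY_TRANSCRIPT, [("E1", "E2"), ("E2", "E3"), ("E3", "E4")])

# Donor next to exon one joined to the acceptor next to exon three:
# the whole of exon two is removed along with the introns.
exon_skipped = splice(PRIMARY_TRANSCRIPT, [("E1", "E3"), ("E3", "E4")])

print(constitutive)
print(exon_skipped)
```

The two calls correspond to the top and bottom examples on the diagram: the same primary transcript gives two different mature messages depending on which donor meets which acceptor.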

71
00:10:54 --> 00:11:00
All right.  So let's move onto the
major topic of today's lecture --

72
00:11:00 --> 00:11:03
-- which takes us back to the
central dogma.

73
00:11:03 --> 00:11:07
And I want to introduce to you a
term that is very important that you

74
00:11:07 --> 00:11:11
know and you understand.
And this is the term gene

75
00:11:11 --> 00:11:15
expression.  And really what we've
been talking about is

76
00:11:15 --> 00:11:22
gene expression.

77
00:11:22 --> 00:11:28
Gene expression simply refers to the
generation of the final product of a

78
00:11:28 --> 00:11:34
gene from the gene.
So we're talking about the

79
00:11:34 --> 00:11:40
formation of a protein as directed
by a particular gene.

80
00:11:40 --> 00:11:47
OK?  So gene expression is,
if you like, the readout.  Here's

81
00:11:47 --> 00:11:53
another way of putting it.
The readout, the final readout of a

82
00:11:53 --> 00:12:00
gene, or the generation of the final
product of a gene.

83
00:12:00 --> 00:12:05
I'm going to come back to this term
over and over again,

84
00:12:05 --> 00:12:10
and I will ask you to define it in
your own way, but it's a term I want

85
00:12:10 --> 00:12:16
to throw out at you now because you
do need to know it.

86
00:12:16 --> 00:12:21
It's very pervasive.
Today I want to talk about the step

87
00:12:21 --> 00:12:27
in gene expression called translation
whereby RNA is converted or is used

88
00:12:27 --> 00:12:33
to direct synthesis of a protein.
So let's define translation because

89
00:12:33 --> 00:12:39
it is, I think,
one of the most interesting

90
00:12:39 --> 00:12:45
questions in molecular biology.
Certainly from a historical

91
00:12:45 --> 00:12:52
perspective that was true.
And the notion in translation is

92
00:12:52 --> 00:12:58
that the base sequence of a mRNA
somehow leads to the synthesis of a

93
00:12:58 --> 00:13:05
protein with a defined
amino acid sequence.

94
00:13:05 --> 00:13:31
Now, if you think about DNA

95
00:13:31 --> 00:13:35
replication, transcription and
translation, the relationship

96
00:13:35 --> 00:13:40
between them, there is a nice
analogy that one can make.

97
00:13:40 --> 00:13:44
DNA uses the base code, four bases.
In transcription, RNA uses those same

98
00:13:44 --> 00:13:48
four bases as a code,
but it's slightly different from DNA.

99
00:13:48 --> 00:13:53
So the synthesis of RNA using a DNA
template is kind of like changing

100
00:13:53 --> 00:13:57
fonts in a document that you have.
It's kind of like going from Times

101
00:13:57 --> 00:14:02
New Roman to Helvetica.
You haven't really changed much.

102
00:14:02 --> 00:14:07
It just looks a bit different.
Translation is very different.
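
To make the font-change analogy for transcription concrete, here is a minimal Python sketch. The nine-base template strand and the name transcribe are made up for illustration, and it assumes the template is written 3 prime to 5 prime so that the RNA comes out 5 prime to 3 prime.

```python
# Complement rules for copying a DNA template strand into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the mRNA (5' to 3') made from a DNA template written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


template = "TACACCAAA"  # hypothetical template strand, 3' to 5'
coding = "ATGTGGTTT"    # its partner (coding) strand, 5' to 3'

mrna = transcribe(template)
print(mrna)

# The "font change": the mRNA matches the coding strand with U in place of T.
same_message = mrna == coding.replace("T", "U")
print(same_message)
```

The message is still written in the same four-letter alphabet, with U standing in for T, which is why nothing much has changed at this step.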

103
00:14:07 --> 00:14:11
The use of mRNA to direct the
synthesis of a protein is much more

104
00:14:11 --> 00:14:16
analogous to changing language where
you've taken English and translated

105
00:14:16 --> 00:14:21
it into Chinese, or taken Russian and
translated it into French.

106
00:14:21 --> 00:14:26
OK?  So this is a really different
process.  And it was clear from the

107
00:14:26 --> 00:14:31
outset, historically,
that one had to think in a slightly

108
00:14:31 --> 00:14:36
different way about how this process
was directed.

109
00:14:36 --> 00:14:41
And I want to talk about four things
with respect to translation.

110
00:14:41 --> 00:14:47
Firstly, I want to talk about the
genetic code that allows RNA to

111
00:14:47 --> 00:14:53
direct protein synthesis.
I want to talk about something

112
00:14:53 --> 00:14:59
called the interpreter
of that code.

113
00:14:59 --> 00:15:03
I'm going to talk about the factory
in which the synthesis takes place.

114
00:15:03 --> 00:15:08
And then I'm going to get to a
discussion of the molecular basis for

115
00:15:08 --> 00:15:28
genotype and phenotype.

116
00:15:28 --> 00:15:33
So let's think about the code.
And thinking about this starts from

117
00:15:33 --> 00:15:38
a very simple logical place.
And the place is this.  One starts

118
00:15:38 --> 00:15:43
with four bases,
A, G, C and T or A,

119
00:15:43 --> 00:15:49
G, C and U, depending on whether you're
talking about DNA or RNA.

120
00:15:49 --> 00:15:54
And somehow those four bases have
to be used in some kind of code to

121
00:15:54 --> 00:16:00
give you an outcome
of 20 amino acids.

122
00:16:00 --> 00:16:04
And I am going to use the
abbreviation AA for amino acids.

123
00:16:04 --> 00:16:09
So you can look at this and
immediately understand there has to

124
00:16:09 --> 00:16:13
be some kind of combinatorial code
in order to specify those 20 amino

125
00:16:13 --> 00:16:18
acids.  So you can do combinations
and you can say,

126
00:16:18 --> 00:16:23
OK, if two bases were used and you
could have combinations of doublets,

127
00:16:23 --> 00:16:27
how many combinations can you get to
and would that be enough to specify

128
00:16:27 --> 00:16:34
those 20 amino acids?
Well, no, because two base

129
00:16:34 --> 00:16:42
combinations would only give you 16
possible amino acid combinations,

130
00:16:42 --> 00:16:50
or the ability to specify 16 amino
acids.  OK?  Four squared.

131
00:16:50 --> 00:16:59
How about three base combinations?
Well, that's better.

132
00:16:59 --> 00:17:05
What you can get out of that is 64
different combinations.

133
00:17:05 --> 00:17:11
OK?  And that is plenty to specify
your 20 amino acids with some left

134
00:17:11 --> 00:17:17
over.  And, in fact,
this is what is used.

135
00:17:17 --> 00:17:23
Combinations of three bases.
And these combinations of three

136
00:17:23 --> 00:17:32
bases are termed the triplet code.
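
The counting argument can be checked with a short Python sketch. The helper name codewords is an assumption for illustration; it simply enumerates every ordered combination of the four RNA bases at a given length.

```python
from itertools import product

BASES = "AGCU"
AMINO_ACIDS = 20


def codewords(length):
    """List every possible base combination of the given length."""
    return ["".join(combo) for combo in product(BASES, repeat=length)]


counts = {n: len(codewords(n)) for n in (1, 2, 3)}
for n, count in counts.items():
    verdict = "enough" if count >= AMINO_ACIDS else "not enough"
    print(f"{n} base(s): {count} combinations, {verdict} for {AMINO_ACIDS} amino acids")

# Shortest word length that can label all 20 amino acids.
shortest = min(n for n, count in counts.items() if count >= AMINO_ACIDS)
print(shortest)
```

Singlets give 4 and doublets give 16, both short of 20, so three bases is the shortest word length that can cover every amino acid.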

137
00:17:32 --> 00:17:36
The discovery of the triplet code is
really fascinating.

138
00:17:36 --> 00:17:40
I don't have time to go into it in
this lecture, but your book is not

139
00:17:40 --> 00:17:44
too bad on the discovery.
And I will post on your website,

140
00:17:44 --> 00:17:48
for those of you who really want to
get into it, a reference to a very

141
00:17:48 --> 00:17:52
interesting historical account of
the discovery of the triplet code

142
00:17:52 --> 00:17:56
and indeed of much of molecular
biology.  But it's a fascinating

143
00:17:56 --> 00:18:00
story.  But I'm going to tell you
the code is a triplet code.  OK.

144
00:18:00 --> 00:18:08
So what does that mean?
It means that three bases

145
00:18:08 --> 00:18:16
correspond to a particular amino
acid.  OK?  So one triplet of bases

146
00:18:16 --> 00:18:24
corresponds, I'm writing this out
because it's really important that

147
00:18:24 --> 00:18:32
you know this, corresponds
to one amino acid.

148
00:18:32 --> 00:18:38
And this base triplet gets a special
name.  It's called a codon.

149
00:18:38 --> 00:18:46
And the thing that you will have

150
00:18:46 --> 00:18:50
noticed is that what I've told you
is there are 64 possible

151
00:18:50 --> 00:18:54
combinations of triplets and only 20
amino acids.  And so that leaves

152
00:18:54 --> 00:18:58
some over.  What happens?
Well, they're all used.

153
00:18:58 --> 00:19:04
And what happens is that although
the code is universal,

154
00:19:04 --> 00:19:10
as far as we know it arose just once,
all living organisms on our planet

155
00:19:10 --> 00:19:17
use this code,
it is a redundant code.

156
00:19:17 --> 00:19:23
So I will write down it is
redundant but not ambiguous,

157
00:19:23 --> 00:19:30
and tell you what that means.
So what that means is that an amino

158
00:19:30 --> 00:19:37
acid can be specified by more than
one triplet, and I'll show you that

159
00:19:37 --> 00:19:44
in a moment, but that any triplet of
bases only corresponds to one amino

160
00:19:44 --> 00:19:51
acid.  Let's look at some diagrams
to show you what I mean.

161
00:19:51 --> 00:19:58
This is a table of your amino acid
code.

162
00:19:58 --> 00:20:02
These letters in columns represent
the bases.  And next to them are

163
00:20:02 --> 00:20:07
written the amino acids that
correspond to this particular code.

164
00:20:07 --> 00:20:11
Let's start with an easy one.  This
is methionine encoded by AUG.

165
00:20:11 --> 00:20:16
And that's one you should actually
remember.  OK?

166
00:20:16 --> 00:20:20
And for methionine there is only
one possible codon.

167
00:20:20 --> 00:20:25
It is AUG and always AUG.
But let's keep going here.

168
00:20:25 --> 00:20:30
And let's look at the amino acid
leucine.

169
00:20:30 --> 00:20:36
Leucine is encoded by six possible
triplets, six possible codons,

170
00:20:36 --> 00:20:43
UUA, UUG, CUU, CUC, CUA and CUG.
Any one of those in a mRNA can

171
00:20:43 --> 00:20:49
encode leucine.
However, CUU only encodes leucine.

172
00:20:49 --> 00:20:56
It never encodes another amino acid.
OK?  And that's what

173
00:20:56 --> 00:21:04
I mean by redundant.
More than one triplet can encode one

174
00:21:04 --> 00:21:12
amino acid, but any given triplet
only corresponds to one particular

175
00:21:12 --> 00:21:20
amino acid.  OK.
You will have practice on this kind

176
00:21:20 --> 00:21:28
of thing as you go along.
So let's get some basics down here.
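
As a sketch of "redundant but not ambiguous", the Python below builds the standard codon table and then inverts it. The compact 64-letter string is the usual way of writing the standard code with bases ordered U, C, A, G; the variable names are illustrative assumptions.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Invert the table: amino acid -> every codon that specifies it.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

print("Met (M):", codons_for["M"])
print("Leu (L):", sorted(codons_for["L"]))
print("Stop (*):", sorted(codons_for["*"]))
```

Because CODON_TABLE is a dictionary, each codon has exactly one entry (not ambiguous), while the inverted table shows several codons sharing one amino acid (redundant).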

177
00:21:28 --> 00:21:36
The template in the whole
translation process is your mRNA.

178
00:21:36 --> 00:21:41
OK?  It's the code.
It contains the code.

179
00:21:41 --> 00:21:46
It is read to give a protein
readout from 5 prime to 3 prime.

180
00:21:46 --> 00:21:51
And the readout of the protein, as
I mentioned to you way back when,

181
00:21:51 --> 00:21:57
reads out from the amino to the
carboxyl end.  New amino acids are

182
00:21:57 --> 00:22:05
added onto the carboxyl end --
-- and the free amino group

183
00:22:05 --> 00:22:15
corresponds to the first amino acid
polymerized.  So it is read 5 prime

184
00:22:15 --> 00:22:25
to 3 prime, and that corresponds to
the amino to the carboxy

185
00:22:25 --> 00:22:37
growth of the protein.

186
00:22:37 --> 00:22:44
All proteins start with the same amino
acid, and that is methionine.

187
00:22:44 --> 00:22:51
And the start or initiation codon
in all mRNAs is methionine,

188
00:22:51 --> 00:22:59
oops, is AUG which encodes
methionine.

189
00:22:59 --> 00:23:03
Now, not all final proteins have got
methionine at their amino ends

190
00:23:03 --> 00:23:08
because it can be cleaved off.
OK?  So you don't have to land up

191
00:23:08 --> 00:23:13
with a protein that has a methionine
at its amino end,

192
00:23:13 --> 00:23:18
but it starts off with methionine
there.  And then there are no gaps

193
00:23:18 --> 00:23:22
in the message.
It is read without any punctuation

194
00:23:22 --> 00:23:27
marks, except for the fact that the
codons are next to one another in a

195
00:23:27 --> 00:23:33
non-overlapping way.  OK?
So there are no gaps.

196
00:23:33 --> 00:23:39
And the only punctuation is the
start codon and a series of stop

197
00:23:39 --> 00:23:45
codons which do not encode any amino
acids.  These are UAA,

198
00:23:45 --> 00:23:51
UAG and UGA.  And you can remember
them if you want,

199
00:23:51 --> 00:23:57
but we're not going to test that you
do.  OK?  You can use your

200
00:23:57 --> 00:24:02
amino acid tables.
OK.  So your punctuation is the

201
00:24:02 --> 00:24:08
start and the end of the message.
All right.  So let's go on and talk

202
00:24:08 --> 00:24:14
about the interpreter and what I
mean by the interpreter.
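
The reading rules so far (start at AUG, read 5 prime to 3 prime in non-overlapping triplets with no gaps, stop at UAA, UAG or UGA) can be sketched in Python. This is a simplification: it assumes translation begins at the first AUG in the string, and the function name translate and the test message are made up for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Read an mRNA 5' to 3' and return the protein, amino end first."""
    start = mrna.find("AUG")           # the start codon sets the reading frame
    if start == -1:
        return ""                      # no start codon, nothing is made
    protein = []
    # Step three bases at a time: no gaps, no overlap between codons.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # stop codon: end of the message
            break
        protein.append(amino_acid)     # added at the carboxyl end
    return "".join(protein)


# Hypothetical message: 5' leader, AUG UGG UUU, stop codon UAA, 3' trailer.
protein = translate("GGAUGUGGUUUUAAGG")
print(protein)
```

The result uses one-letter amino acid codes, so methionine, tryptophan, phenylalanine prints as MWF; the bases before the AUG and after the stop codon are not translated.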

203
00:24:14 --> 00:24:20
In this diagram here I have got,
look up here for a moment.  This is

204
00:24:20 --> 00:24:26
quite a nice diagram not from your
book.  I've got your DNA strand,

205
00:24:26 --> 00:24:31
which is your template strand.
Your corresponding RNA,

206
00:24:31 --> 00:24:35
your mRNA, and the readout of the
RNA to the protein.

207
00:24:35 --> 00:24:40
And here are the codons,
UGG, this is in the middle of the

208
00:24:40 --> 00:24:44
protein so that's why there's no
methionine, UGG corresponding to

209
00:24:44 --> 00:24:48
tryptophan, UUU corresponding to
phenylalanine.

210
00:24:48 --> 00:24:53
You can see how the codons are
right next to each other,

211
00:24:53 --> 00:24:57
OK, but do not overlap.  In fact,
I'm going to write that on the board.

212
00:24:57 --> 00:25:02
So no gaps and no codon overlap.
Very important that you understand

213
00:25:02 --> 00:25:07
that.  So when people looked at this
and figured out what the codons

214
00:25:07 --> 00:25:11
corresponded to in terms of amino
acids there was the question of,

215
00:25:11 --> 00:25:16
well, how do you actually get those
amino acids corresponding to those

216
00:25:16 --> 00:25:21
codons?  And there was a sense that
you needed some kind of adapter or

217
00:25:21 --> 00:25:25
interpreter molecule that both
recognized the codon and recognized

218
00:25:25 --> 00:25:30
the amino acid.
And that's the next thing that I'm

219
00:25:30 --> 00:25:40
going to tell you.  And --

220
00:25:40 --> 00:25:46
-- stop.  Well,
I apologize on behalf of our

221
00:25:46 --> 00:25:53
illustrious institute for the boards
in this room.  OK.

222
00:25:53 --> 00:26:00
So all right.  So let's talk about
the interpreter.

223
00:26:00 --> 00:26:07
And I'll tell you that this is the
class of RNA someone brought up

224
00:26:07 --> 00:26:14
earlier called tRNAs.
So tRNAs, as you may recall,

225
00:26:14 --> 00:26:21
are these very small RNAs.  They
are about 100 base pairs,

226
00:26:21 --> 00:26:28
100 bases in length, and there are a
lot of them.  And there is a tRNA

227
00:26:28 --> 00:26:35
that corresponds to every codon.
So tRNAs recognize both the amino

228
00:26:35 --> 00:26:42
acid and the specific codon.
And they recognize,

229
00:26:42 --> 00:26:49
let's talk about the codon first.
They recognize the codon by DNA

230
00:26:49 --> 00:26:56
complement, by RNA complementarity,
by base pairing to a region on the

231
00:26:56 --> 00:27:05
tRNA called the anti-codon.

232
00:27:05 --> 00:27:15
So let's talk about methionine for a
moment.  The codon for methionine is

233
00:27:15 --> 00:27:25
AUG.  That's the codon.
Woops.  Hold on one second here.

234
00:27:25 --> 00:27:33
5 prime AUG, that's your codon.
And what will be complementary to

235
00:27:33 --> 00:27:40
that on the tRNA from the
3 prime end is UAC.

236
00:27:40 --> 00:27:47
OK?  So this anti-codon is on the

237
00:27:47 --> 00:27:52
tRNA.  Anti-codons can either be
written from the 3 prime end or you

238
00:27:52 --> 00:27:57
can switch them around and talk
about 5 prime CAU.  It's

239
00:27:57 --> 00:28:02
the same thing.  OK?
So that's one thing.
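
The two ways of writing an anti-codon can be sketched in Python as follows. The function names are illustrative assumptions, and the sketch uses strict base pairing only, ignoring the looser pairing that real tRNAs can show at the third codon position.

```python
# RNA base-pairing rules.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_3_to_5(codon):
    """Anti-codon written 3' to 5', lined up base by base under the codon."""
    return "".join(PAIR[base] for base in codon)


def anticodon_5_to_3(codon):
    """The same anti-codon written the other way around, 5' to 3'."""
    return anticodon_3_to_5(codon)[::-1]


codon = "AUG"  # methionine codon, written 5' to 3'
print(anticodon_3_to_5(codon))  # UAC
print(anticodon_5_to_3(codon))  # CAU
```

Both outputs describe the same three bases on the tRNA; only the direction of writing differs.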

240
00:28:02 --> 00:28:06
I'll show you a picture in a moment.
The other thing that a tRNA has to

241
00:28:06 --> 00:28:11
recognize is the amino acid.
And that's more complicated.

242
00:28:11 --> 00:28:16
For different amino acids there are
different parts of the tRNA molecule

243
00:28:16 --> 00:28:20
that recognize specific amino acids.
And it hasn't actually been figured

244
00:28:20 --> 00:28:25
out completely which part of which
tRNA recognizes a particular amino

245
00:28:25 --> 00:28:30
acid, but the recognition
is also on the tRNA --

246
00:28:30 --> 00:28:35
-- and not really on the anti-codon.
Or certainly not the anti-codon

247
00:28:35 --> 00:28:40
alone is probably fair to say.
So let me show you a picture of a

248
00:28:40 --> 00:28:46
tRNA.  tRNAs are single-stranded
RNAs that fold up on themselves in a

249
00:28:46 --> 00:28:51
complex way.  OK?
Here's the representation of the

250
00:28:51 --> 00:28:57
three-dimensional structure of a
tRNA.  And these cross things

251
00:28:57 --> 00:29:02
are hydrogen bonds.
So there's a lot of base-pairing

252
00:29:02 --> 00:29:07
within the tRNA.
Represented more simply,

253
00:29:07 --> 00:29:12
the tRNA forms this kind of
cloverleaf structure,

254
00:29:12 --> 00:29:17
and the anti-codon is at one end of
the tRNA.  OK?

255
00:29:17 --> 00:29:22
So this is the thing that's base
pairing to the codon on the mRNA.

256
00:29:22 --> 00:29:27
The amino acid attaches to the very
3 prime end of the tRNA at this site

257
00:29:27 --> 00:29:33
which is a CCA.  OK?
And there is a covalent attachment

258
00:29:33 --> 00:29:41
of the tRNA to the amino acid at
this CCA region.

259
00:29:41 --> 00:29:49
All right.  But the part that
recognizes the amino acid can be

260
00:29:49 --> 00:29:57
somewhere in the rest of the tRNA
molecule.  It's very complex.

261
00:29:57 --> 00:30:03
OK.  So let's move on now.
Actually, let me tell you one more

262
00:30:03 --> 00:30:09
thing, though I'll tell it to you in
a moment.  OK.

263
00:30:09 --> 00:30:14
So let's move on now to the
question of the factory.

264
00:30:14 --> 00:30:19
And by factory I mean the place
where protein synthesis or

265
00:30:19 --> 00:30:25
translation takes place.
And the factory here is the

266
00:30:25 --> 00:30:30
ribosome.
We mentioned ribosomes right at the

267
00:30:30 --> 00:30:36
beginning of the course in the
second lecture and haven't said a

268
00:30:36 --> 00:30:42
whole bunch about them since.
Ribosomes are very large structures.

269
00:30:42 --> 00:30:48
They are not membrane bound,
but they are very large.  This is a

270
00:30:48 --> 00:30:54
representation of a ribosome from
bacteria that has a small subunit

271
00:30:54 --> 00:31:01
and a large subunit.
And, interestingly,

272
00:31:01 --> 00:31:09
ribosomes are an obligatory complex
between the so-called rRNA,

273
00:31:09 --> 00:31:17
or ribosomal RNA, plus proteins.
There is a small subunit, this is

274
00:31:17 --> 00:31:25
really bad.  Let's try this one.
Small subunit which consists of one

275
00:31:25 --> 00:31:33
ribosomal RNA of a particular kind
and 33 proteins.

276
00:31:33 --> 00:31:37
And there is a large subunit.
And I tell you this not because you

277
00:31:37 --> 00:31:42
need to remember this,
but you need to appreciate that this

278
00:31:42 --> 00:31:46
is a very complex structure.
It's a very cool and complex

279
00:31:46 --> 00:31:51
structure.  The large subunit
consists of three RNAs

280
00:31:51 --> 00:31:56
and 45 proteins.
You can represent the structure of

281
00:31:56 --> 00:32:00
the ribosome much more beautifully
in this diagram,

282
00:32:00 --> 00:32:04
or in this representation,
where the RNA is shown in gold,

283
00:32:04 --> 00:32:08
or the two RNAs are shown in gold,
or the multiple RNAs are shown in

284
00:32:08 --> 00:32:12
gold, and some of the proteins are
shown as these other structures and

285
00:32:12 --> 00:32:16
you can see the alpha helices of the
proteins.  OK?

286
00:32:16 --> 00:32:20
And what you should be able to see
on this diagram,

287
00:32:20 --> 00:32:24
let me point to this one for a
change, is this tunnel,

288
00:32:24 --> 00:32:28
this hole through the structure.
And this is the tunnel through which

289
00:32:28 --> 00:32:33
the mRNA threads as it is translated.
So this is truly a factory.

290
00:32:33 --> 00:32:39
tRNAs come into this, the mRNA
threads through,

291
00:32:39 --> 00:32:44
and as that takes place so the mRNA
directs the synthesis of the protein.

292
00:32:44 --> 00:32:49
OK.  This is a representation from
your book.  I don't like most of the

293
00:32:49 --> 00:32:54
diagrams from your book so I redrew
most of them for you,

294
00:32:54 --> 00:33:00
but I left this one.  This is a
representation of translation.

295
00:33:00 --> 00:33:05
The mRNA is shown in green and the
large subunit and small subunit of

296
00:33:05 --> 00:33:10
the ribosome come together,
form the complete ribosome, and then

297
00:33:10 --> 00:33:15
the mRNA actually is threaded through
the ribosome and the protein,

298
00:33:15 --> 00:33:21
well, here they've called it a
polypeptide chain is threaded through.

299
00:33:21 --> 00:33:26
So let's explore this in a bit more
detail.  And in order to do so,

300
00:33:26 --> 00:33:32
I've got to conserve boards here
because we are one board short.

301
00:33:32 --> 00:33:36
In order to do so I need to
introduce you to the various parts

302
00:33:36 --> 00:33:41
of a mRNA.  And this is on one of
the diagrams that I handed out today.

303
00:33:41 --> 00:33:46
OK?  So you don't need to redraw it.
Just look at the diagram.

304
00:33:46 --> 00:33:51
In the mRNA, and this is crucial
for translation,

305
00:33:51 --> 00:33:56
there are three parts that are
really important.  Two

306
00:33:56 --> 00:34:02
of them, excuse me.
Two of them are actually added to

307
00:34:02 --> 00:34:08
the mRNA after it is transcribed.
The thing at the very 5 prime end

308
00:34:08 --> 00:34:14
called the cap and something at the
very 3 prime end,

309
00:34:14 --> 00:34:21
which is a long string of up to a
couple of hundred A residues

310
00:34:21 --> 00:34:27
contiguous, which is called the poly
A tail.  And these parts of the mRNA

311
00:34:27 --> 00:34:34
are crucial for the first part of
translation which is initiation.

312
00:34:34 --> 00:34:39
As in replication and transcription,
you can divide up these synthetic

313
00:34:39 --> 00:34:45
processes into different steps.
And initiation is the first step.

314
00:34:45 --> 00:34:51
And in order for initiation to
occur one needs the parts of the RNA

315
00:34:51 --> 00:34:57
that are added on
post-transcriptionally.

316
00:34:57 --> 00:35:03
You need the cap, this poly A tail --
-- and also a region that is just

317
00:35:03 --> 00:35:11
upstream or 5 prime of this AUG
initiation codon in a region called

318
00:35:11 --> 00:35:19
the UTR, the 5 prime UTR which
stands for untranslated region.

319
00:35:19 --> 00:35:26
And you also need the AUG codon.
OK.  And what happens is that the

320
00:35:26 --> 00:35:34
ribosome and various initiation
proteins bind to the 5 prime cap and

321
00:35:34 --> 00:35:40
simultaneously to the poly A tail.
So this is really cool.

322
00:35:40 --> 00:35:45
The mRNA is translated as a circle
where this poly A tail,

323
00:35:45 --> 00:35:49
the very 3 prime end is brought all
the way around to the 5 prime end,

324
00:35:49 --> 00:35:54
and you get a whole mess of proteins
sitting on that part of the RNA and

325
00:35:54 --> 00:36:00
starting translation.
So you get initiation proteins,

326
00:36:00 --> 00:36:08
which are called initiation factors,
and you get ribosome assembly where

327
00:36:08 --> 00:36:15
the small subunit and the large
subunit come together,

328
00:36:15 --> 00:36:23
and you get a tRNA carrying a
methionine amino acid coming and

329
00:36:23 --> 00:36:36
sitting on the AUG.

330
00:36:36 --> 00:36:40
OK.  Let me show you more.
So here we have a cartoon,

331
00:36:40 --> 00:36:45
you have this in front of you but
I'm going to show it to you in a

332
00:36:45 --> 00:36:50
step-wise fashion,
of this ribosome recognition

333
00:36:50 --> 00:36:55
sequence.  Actually,
I'm not going to show you now but in

334
00:36:55 --> 00:37:00
your handout there are pictures of
the circular RNAs being translated.

335
00:37:00 --> 00:37:04
OK?  That's something new and it's
something very interesting.

336
00:37:04 --> 00:37:09
I'm not going to dwell on it now.
OK?  Where the poly A tail comes

337
00:37:09 --> 00:37:13
all the way around to that 5 prime
so-called cap region.

338
00:37:13 --> 00:37:18
I should just point out,
again, I'm not going to dwell on it,

339
00:37:18 --> 00:37:22
the so-called 5 prime cap region is
a modified guanine.

340
00:37:22 --> 00:37:27
OK?  MeG stands for methylguanine.
You can call it the cap.  It

341
00:37:27 --> 00:37:32
designates the very 5 prime
end of the message.

342
00:37:32 --> 00:37:39
OK.  So let us look at the sequence
of translation.

343
00:37:39 --> 00:37:47
And what I'm going to tell you,
before I go through the cartoon, is

344
00:37:47 --> 00:37:54
that in the elongation process
sequential tRNAs carrying their

345
00:37:54 --> 00:38:02
particular amino acids are
going to come in.

346
00:38:02 --> 00:38:06
And they're going to sit on these
various codons.

347
00:38:06 --> 00:38:11
And peptide bonds are going to form
between adjacent amino acids so you

348
00:38:11 --> 00:38:16
get the polypeptide chain growing.
OK?  So let's start off with the

349
00:38:16 --> 00:38:20
initiator.  There's your tRNA that
is joined to methionine.

350
00:38:20 --> 00:38:25
And I need to introduce you to a
term now which is a charged,

351
00:38:25 --> 00:38:33
I didn't have space before.
The term charged tRNA refers to the

352
00:38:33 --> 00:38:43
tRNA covalently linked to its amino
acid.  And then correspondingly the

353
00:38:43 --> 00:38:54
uncharged tRNA has no amino acid.
The amino acid has fallen off or

354
00:38:54 --> 00:39:02
been used.  OK.
So there is a tRNA sitting on the

355
00:39:02 --> 00:39:06
first codon, the AUG,
and that's the start of the sentence.

356
00:39:06 --> 00:39:11
That positions the beginning of the
protein.  Now,

357
00:39:11 --> 00:39:15
watch what happens.
Here comes another tRNA that

358
00:39:15 --> 00:39:19
corresponds to leucine,
and you're getting base pairing here.

359
00:39:19 --> 00:39:24
That first tRNA is base paired to
the AUG codon through its anti-codon.

360
00:39:24 --> 00:39:28
The second tRNA is base paired to
the second codon through

361
00:39:28 --> 00:39:34
its anti-codon.
And now you've got a methionine tRNA

362
00:39:34 --> 00:39:40
sitting next to a leucine tRNA.
OK.  Everyone with me here?  And

363
00:39:40 --> 00:39:46
what happens now is that a peptide
bond forms between the methionine

364
00:39:46 --> 00:39:52
and the leucine.
In particular, this methionine is

365
00:39:52 --> 00:39:58
going to move over to that leucine
over there and lead to uncharging of

366
00:39:58 --> 00:40:04
that particular tRNA.
Take a look.  OK,

367
00:40:04 --> 00:40:10
so I've shown you that methionine is
going to form a peptide bond with

368
00:40:10 --> 00:40:16
the leucine.  Now,
watch what happens next.

369
00:40:16 --> 00:40:21
Here's the methionine tRNA.
It's lost its amino acid, OK,

370
00:40:21 --> 00:40:27
so it falls off the message.  It's
done its thing.  It's

371
00:40:27 --> 00:40:33
no longer needed.
Along comes, no,

372
00:40:33 --> 00:40:39
sitting there is this leucine tRNA
which is now covalently attached to

373
00:40:39 --> 00:40:45
the methionine by a peptide bond.
And there's a free amino end here

374
00:40:45 --> 00:40:51
which designates the first amino
acid synthesized in a polypeptide

375
00:40:51 --> 00:40:57
chain.  And here comes in the next
tRNA that corresponds to a serine

376
00:40:57 --> 00:41:03
tRNA, base paired
by its anti-codon to the

377
00:41:03 --> 00:41:08
codon on the mRNA.
OK?  And the same thing is going to

378
00:41:08 --> 00:41:13
happen again.  The leucine and the
methionine are going to be

379
00:41:13 --> 00:41:17
transferred over and make a peptide
bond with the serine,

380
00:41:17 --> 00:41:22
and so you get elongation of the
polypeptide chain.

381
00:41:22 --> 00:41:27
So what I'm going to write under
elongation is that adjacent

382
00:41:27 --> 00:41:35
amino acids join.

383
00:41:35 --> 00:41:44
Uncharged tRNAs leave,
are released, and sequentially new

384
00:41:44 --> 00:41:54
tRNAs corresponding
to codons come in.
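
As a rough illustration of the elongation steps just listed, here is a minimal Python sketch. The message AUGCUGUCU, the three-entry codon table, and the function names are assumptions chosen to match the methionine, leucine, serine example. They are not taken from the lecture slides.

```python
# Hypothetical mini codon table: only the three codons used in this example.
CODON_TO_AMINO = {"AUG": "Met", "CUG": "Leu", "UCU": "Ser"}

# Watson-Crick pairing between RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon):
    """Return the anticodon (written 5'->3') that base pairs with a codon."""
    return "".join(PAIR[base] for base in reversed(codon))


def elongate(mrna):
    """Walk the message codon by codon and grow the polypeptide chain.

    Each step models one charged tRNA arriving, the chain being
    transferred onto its amino acid, and the previous tRNA leaving
    uncharged.
    """
    chain = []
    events = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        chain.append(CODON_TO_AMINO[codon])
        events.append((codon, anticodon(codon), "-".join(chain)))
    return chain, events


chain, events = elongate("AUGCUGUCU")
for codon, anti, so_far in events:
    print(codon, anti, so_far)
# AUG CAU Met
# CUG CAG Met-Leu
# UCU AGA Met-Leu-Ser
```

The chain is written with the free amino end on the left, so the first amino acid synthesized (methionine) always appears first.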

385
00:41:54 --> 00:42:20
OK.  All right.

386
00:42:20 --> 00:42:26
So this whole process goes on until
the ribosome and all

387
00:42:26 --> 00:42:31
these tRNAs reach a place in the
mRNA where there is a codon that

388
00:42:31 --> 00:42:37
doesn't correspond
to an amino acid.

389
00:42:37 --> 00:42:43
A so-called stop codon.
And at this point there is a

390
00:42:43 --> 00:42:49
process called termination where
there is a stop codon that does not

391
00:42:49 --> 00:42:55
code for any amino acid and doesn't
have a corresponding tRNA therefore.

392
00:42:55 --> 00:43:01
And at this point the protein
polypeptide chain falls

393
00:43:01 --> 00:43:12
off the message.
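
Termination can be sketched the same way: read codons in order and stop at the first one that has no amino acid. This is a minimal sketch that assumes the standard genetic code and a made-up message. The names CODON_TABLE and translate are illustrative only.

```python
# Standard genetic code packed into one string, codons ordered U, C, A, G
# at each position; "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the message."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is made
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":
            break  # stop codon: no tRNA pairs here, the chain is released
        protein.append(amino)
    return "".join(protein)


print(translate("AUGCUGUCUUAAGGG"))  # MLS
```

The trailing GGG is never read, because the chain has already been released at UAA.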

394
00:43:12 --> 00:43:18
All right.  You guys OK with that?
OK.  I'm going to refer you, I'm

395
00:43:18 --> 00:43:24
not going to show this movie now.
Go and watch this movie.

396
00:43:24 --> 00:43:30
Go and watch the movie by
yourselves. OK?

397
00:43:30 --> 00:43:34
I don't want to take the time to
watch it now.  It's an animation of

398
00:43:34 --> 00:43:38
what I've just told you.
There are some diagrams in your book.

399
00:43:38 --> 00:43:42
You can look at them.
They talk about things called A

400
00:43:42 --> 00:43:46
sites and P sites in the ribosome.
To me that is less important than

401
00:43:46 --> 00:43:50
understanding the actual
interactions between the tRNAs and

402
00:43:50 --> 00:43:54
the mRNAs.  Here is a circular RNA
with that poly A tail and the 5

403
00:43:54 --> 00:43:58
prime cap binding proteins to
initiate translation

404
00:43:58 --> 00:44:02
as a circular RNA.
All right.  So,

405
00:44:02 --> 00:44:07
finally, let's move to this
complicated, I think fantastic

406
00:44:07 --> 00:44:12
bringing together of mutation from
genotype to phenotype.

407
00:44:12 --> 00:44:18
You've had a genetics module where
you talked about mutations,

408
00:44:18 --> 00:44:23
you talked about the genotype,
you talked about the phenotype.

409
00:44:23 --> 00:44:28
We've been throwing at you genotype
has got something to do with the DNA

410
00:44:28 --> 00:44:34
base sequence.
Phenotype has got something to do

411
00:44:34 --> 00:44:40
with the final product,
particularly the protein sequence.

412
00:44:40 --> 00:44:47
Let's explore that in a bit more
detail now and ask,

413
00:44:47 --> 00:44:53
what is the molecular basis for
changes in genotype and how do these

414
00:44:53 --> 00:45:00
correspond to changes in phenotype?
OK.  So genotype to phenotype.

415
00:45:00 --> 00:45:04
And I want to emphasize again that
phenotype is an outcome of a change

416
00:45:04 --> 00:45:09
in function of the final product of
a gene.  It isn't necessarily the

417
00:45:09 --> 00:45:14
same as the final product of a gene.
OK?  So, for example, a phenotype

418
00:45:14 --> 00:45:19
is gigantism.  Someone who is very
tall.  The molecular basis for that

419
00:45:19 --> 00:45:24
could be multiple things.
It could be production of too much

420
00:45:24 --> 00:45:29
of a hormone, a protein called
growth hormone so that someone grows

421
00:45:29 --> 00:45:34
too tall or taller than normal.
OK?  So the phenotype is

422
00:45:34 --> 00:45:39
connected to the production of a
particular protein, but it's not the

423
00:45:39 --> 00:45:44
same as that protein.  So here's another diagram,
something for you to think about.

424
00:45:44 --> 00:45:50
Mutations, almost anywhere in a gene,
can have an effect on the protein

425
00:45:50 --> 00:45:55
produced.  And there are two ways
the protein produced

426
00:45:55 --> 00:46:00
can be affected.
One is in the amount of protein and

427
00:46:00 --> 00:46:04
the other is in the sequence of the
protein produced.

428
00:46:04 --> 00:46:09
Now, if one gets a mutation in this
promoter region or often in the

429
00:46:09 --> 00:46:13
introns, but particularly the
promoter I've focused on,

430
00:46:13 --> 00:46:18
one can change the amount of RNA
that is being transcribed from a

431
00:46:18 --> 00:46:22
particular gene.
And that change in the amount of

432
00:46:22 --> 00:46:27
RNA will lead to a change in the
amount of protein.

433
00:46:27 --> 00:46:34
And you may get a phenotype because
you're making too little or too much

434
00:46:34 --> 00:46:41
protein.  Conversely,
changes in exons can lead to changes

435
00:46:41 --> 00:46:48
in the actual sequence,
the amino acid sequence of the

436
00:46:48 --> 00:46:55
protein and, therefore,
to its function.  So those are two

437
00:46:55 --> 00:47:02
important distinctions to make.
OK?  So mutations can change the

438
00:47:02 --> 00:47:10
amount or the sequence
of a protein.

439
00:47:10 --> 00:47:15
I'm going to go through some
examples of mutations,

440
00:47:15 --> 00:47:20
and you will go through more in
Section, and you are expected to

441
00:47:20 --> 00:47:25
know these changes and you are
expected to know how the change in

442
00:47:25 --> 00:47:30
DNA sequence may or may not lead to a
change in protein sequence.

443
00:47:30 --> 00:47:34
So look carefully.
I'm not going to get through all my

444
00:47:34 --> 00:47:38
examples today.
You can go and you can do the

445
00:47:38 --> 00:47:42
examples that are posted on your
website.  You'll get more practice.

446
00:47:42 --> 00:47:47
And you really need to know this.
OK, so here's a wild type gene.

447
00:47:47 --> 00:47:51
The top two strands are the DNA.
The bottom of the two strands is the

448
00:47:51 --> 00:47:55
template strand.
This DNA is transcribed into an mRNA

449
00:47:55 --> 00:48:00
and that is translated into the
protein indicated here.
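
To make the template-strand idea concrete, here is a small Python sketch. The nine-base template is an invented example, not the sequence on the slide, and transcribe is a hypothetical helper name. The template is written 3'->5' so that the mRNA comes out 5'->3'.

```python
# Pairing rules for copying a DNA template into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5):
    """Return the mRNA (5'->3') copied from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


template = "TACACCGAC"        # assumed template strand, 3'->5'
print(transcribe(template))   # AUGUGGCUG
```

The mRNA has the same sequence as the non-template (top) strand, with U in place of T.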

450
00:48:00 --> 00:48:06
OK?  Let's look at an example of
what happens when there is a change

451
00:48:06 --> 00:48:13
in the DNA.  So here I've got a
change in the DNA,

452
00:48:13 --> 00:48:19
OK, such that this particular base
pair has been changed.

453
00:48:19 --> 00:48:26
The mRNA, oh, this is the wild type
again.  Here's your wild type

454
00:48:26 --> 00:48:33
sequence, wild type mRNA,
wild type protein.

455
00:48:33 --> 00:48:37
This is a class of change in the DNA
that is called a nonsense mutation.

456
00:48:37 --> 00:48:42
Watch carefully.  So at this
position that I've underlined,

457
00:48:42 --> 00:48:46
watch this.  Don't try to write
anything down.

458
00:48:46 --> 00:48:51
OK?  You'll have plenty of practice.
This is all posted.

459
00:48:51 --> 00:48:55
Just watch.  At this particular
underlined position,

460
00:48:55 --> 00:49:00
instead of a GC base pair there is
now an AT base pair.

461
00:49:00 --> 00:49:06
And that changes this codon UGG into
UAG.  And UAG happens to be a stop

462
00:49:06 --> 00:49:13
codon.  So here's your gene,
your mutant gene, here's your mRNA

463
00:49:13 --> 00:49:19
that comes from the mutant gene,
and here's the protein.  It starts

464
00:49:19 --> 00:49:26
OK with a methionine.
But, look, the next codon is a stop.

465
00:49:26 --> 00:49:32
OK?  So the protein is truncated.
Now, there are a number of classes

466
00:49:32 --> 00:49:36
of mutation.  I am going to write
these on the board.

467
00:49:36 --> 00:49:41
I'm going to ask you to go and read
your handout carefully.

468
00:49:41 --> 00:49:46
And you will cover these in section.
Again, you need to know them so let

469
00:49:46 --> 00:49:50
me list the types of mutation.
In the interest of time, I'm not

470
00:49:50 --> 00:49:55
going to go through them,
but you will be able to work through

471
00:49:55 --> 00:50:00
these examples both in Section
and on your own.

472
00:50:00 --> 00:50:05
So, to end off,
the mutations in exons that you

473
00:50:05 --> 00:50:11
should know are silent mutations
that don't change the sequence of

474
00:50:11 --> 00:50:16
the protein, nonsense mutations that
I've just covered,

475
00:50:16 --> 00:50:22
something called missense mutations
which change the amino acid sequence,

476
00:50:22 --> 00:50:27
and something called frameshift
mutations which also are likely to

477
00:50:27 --> 00:50:33
change the sequence
of amino acids.
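
The four classes can be told apart by comparing the wild-type and mutant proteins. The following is a simplified Python sketch, assuming the standard genetic code and invented example messages built around the UGG to UAG change discussed above. The classify function and its rules are an illustration, not the course's official definitions. In particular, it treats any length change that is not a multiple of three as a frameshift and does not handle in-frame insertions or deletions separately.

```python
# Standard genetic code, codons ordered U, C, A, G; "*" marks stop codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the message."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


def classify(wild_mrna, mutant_mrna):
    """Label an exon mutation as silent, nonsense, missense, or frameshift."""
    if (len(mutant_mrna) - len(wild_mrna)) % 3 != 0:
        return "frameshift"  # reading frame is shifted downstream of the change
    wild, mutant = translate(wild_mrna), translate(mutant_mrna)
    if wild == mutant:
        return "silent"      # codon changed, amino acid did not
    if len(mutant) < len(wild) and wild.startswith(mutant):
        return "nonsense"    # premature stop codon, truncated protein
    return "missense"        # a different amino acid at some position


wild = "AUGUGGCUGUAA"                   # Met-Trp-Leu-stop
print(classify(wild, "AUGUGGCUAUAA"))   # silent   (CUG -> CUA, still Leu)
print(classify(wild, "AUGUAGCUGUAA"))   # nonsense (UGG -> UAG, a stop)
print(classify(wild, "AUGUGUCUGUAA"))   # missense (UGG -> UGU, Trp -> Cys)
print(classify(wild, "AUGGGCUGUAA"))    # frameshift (one base deleted)
```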

478
00:50:33 --> 00:50:37
OK?  As I say,
this will be covered.

479
00:50:37 --> 00:50:42
If you want to come and see me
personally in office hours tomorrow

480
00:50:42 --> 00:50:46
or the next day,
please do, and I'll go through these

481
00:50:46 --> 00:50:49
examples with you.

