Amino Acids and Proteins: Structure, Peptide Bonds, Synthesis, and...

Organic Chemistry. Beginner. ExcellentWiki Editorial Team

Proteins are the molecular machines of life. They catalyze reactions, transport molecules, provide structural support, and mediate cellular communication. Their building blocks, the amino acids, are simple molecules of extraordinary versatility. Twenty standard amino acids, linked in sequences of hundreds to thousands, create the vast diversity of protein structures and functions. The 2018 Nobel Prize in Chemistry, awarded in part for the directed evolution of enzymes, and the 2024 Nobel Prize in Chemistry, awarded for computational protein design and protein structure prediction, underscore the central importance of protein chemistry in modern science.

Structure and Classification of Amino Acids

Each amino acid has a central alpha carbon bonded to an amino group, a carboxyl group, a hydrogen atom, and a variable side chain. At physiological pH, the amino group is protonated as NH3+ and the carboxyl group is deprotonated as COO-, giving amino acids zwitterionic character. The pKa of the alpha-carboxyl group is approximately 2.3, and the pKa of the alpha-amino group is approximately 9.6. The isoelectric point — the pH at which the molecule has no net charge — varies with side chain properties.
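
For an amino acid whose side chain does not ionize, the isoelectric point is the average of the two pKa values quoted above. The following is a minimal Python sketch under that assumption, using the approximate pKa values from this section; the function names are illustrative only.

```python
# Approximate pKa values quoted in the text (not specific to one amino acid).
PKA_CARBOXYL = 2.3
PKA_AMINO = 9.6


def isoelectric_point(pka_carboxyl: float, pka_amino: float) -> float:
    """pI of an amino acid whose side chain does not ionize."""
    return (pka_carboxyl + pka_amino) / 2


def net_charge(ph: float, pka_carboxyl: float = PKA_CARBOXYL,
               pka_amino: float = PKA_AMINO) -> float:
    """Average net charge from the Henderson-Hasselbalch relationship."""
    # Fraction of amino groups in the protonated (+1) form.
    positive = 1 / (1 + 10 ** (ph - pka_amino))
    # Fraction of carboxyl groups in the deprotonated (-1) form.
    negative = 1 / (1 + 10 ** (pka_carboxyl - ph))
    return positive - negative


pI = isoelectric_point(PKA_CARBOXYL, PKA_AMINO)
print(f"pI = {pI:.2f}")
print(f"net charge at pH 7.4 = {net_charge(7.4):+.4f}")
```

At the computed pI the positive and negative fractions cancel, which is the zwitterion condition described above. At pH 7.4 the molecule carries only a very small net negative charge.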

Side Chain Classification

Nonpolar, hydrophobic amino acids include glycine, alanine, valine, leucine, isoleucine, methionine, proline, phenylalanine, and tryptophan. These amino acids tend to cluster in the interior of folded proteins, away from water. Glycine is the simplest amino acid, with only a hydrogen atom as its side chain; the lack of steric hindrance allows glycine to adopt conformations that other amino acids cannot. Proline is unique among amino acids: its side chain forms a ring with the backbone amino group, introducing rigidity into the peptide backbone.

Polar, uncharged amino acids include serine, threonine, cysteine, asparagine, glutamine, and tyrosine. The side chains of these amino acids can form hydrogen bonds. Cysteine is distinctive — its thiol group can form disulfide bonds with another cysteine, creating covalent cross-links that stabilize protein structure. The pKa of the cysteine thiol is approximately 8.3, making it reactive under physiological conditions.

Positively charged amino acids (lysine, arginine, and histidine) have basic side chains. Lysine’s epsilon-amino group has a pKa of approximately 10.5. Arginine’s guanidinium group has a pKa of approximately 12.5. Histidine’s imidazole side chain has a pKa of approximately 6.0, so histidine is the only standard amino acid whose side chain readily gains or loses a proton near neutral pH, a property exploited in enzyme active sites.

Negatively charged amino acids — aspartic acid and glutamic acid — have carboxyl groups in their side chains with pKa values around 4.0. At physiological pH, these side chains are negatively charged. The conjugate bases are called aspartate and glutamate.
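
The pKa values quoted in this section are enough to estimate how each ionizable side chain is charged at a given pH. The sketch below is illustrative only: it applies the Henderson-Hasselbalch relationship to the approximate free amino acid values from the text, whereas pKa values inside a folded protein can shift considerably. The dictionary layout and function name are assumptions made for this example.

```python
# Side chain pKa values as quoted in the text, paired with the charge of the
# protonated form (+1 for the bases, 0 for the acids and the cysteine thiol).
SIDE_CHAINS = {
    "Lys": (10.5, +1),
    "Arg": (12.5, +1),
    "His": (6.0, +1),
    "Asp": (4.0, 0),
    "Glu": (4.0, 0),
    "Cys": (8.3, 0),
}


def side_chain_charge(residue: str, ph: float) -> float:
    """Average side chain charge at a given pH."""
    pka, protonated_charge = SIDE_CHAINS[residue]
    fraction_protonated = 1 / (1 + 10 ** (ph - pka))
    # Losing the proton lowers the charge by one unit.
    return protonated_charge - (1 - fraction_protonated)


for name in SIDE_CHAINS:
    print(f"{name}: {side_chain_charge(name, 7.4):+.2f}")
```

With these inputs, lysine and arginine come out almost fully positive at pH 7.4, aspartate and glutamate almost fully negative, and histidine mostly neutral but close enough to its pKa to switch state with small pH changes.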

Stereochemistry

All standard amino acids except glycine are chiral, with the L-configuration being the biologically relevant form. The absolute configuration is S for all standard amino acids except cysteine, which is R due to the higher priority of sulfur in the Cahn-Ingold-Prelog system. The evolutionary choice of L-amino acids over D-amino acids is one of the great mysteries of biochemistry — both forms are equally likely from a chemical perspective.

Peptide Bond Formation

The peptide bond — the amide bond connecting amino acids — has unique properties. The peptide bond is planar due to resonance between the carbonyl pi system and the nitrogen lone pair. The C-N bond has partial double-bond character — approximately 1.33 angstroms versus 1.47 angstroms for a single bond — restricting rotation and giving the peptide bond a rigid, planar structure. The carbonyl oxygen and the amide hydrogen are trans in almost all peptide bonds.

The planarity and trans configuration of the peptide bond impose constraints on protein folding. The Ramachandran plot maps the allowed phi and psi angles for the peptide backbone. Certain combinations of phi and psi are sterically forbidden — the plot reveals the allowed regions corresponding to alpha helices, beta sheets, and turns.
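
Phi and psi are dihedral (torsion) angles, each defined by four consecutive backbone atoms. The sketch below shows one common way to compute a signed dihedral angle from four points using only the Python standard library. The helper names and the toy coordinates are hypothetical; real values would come from an experimentally determined or predicted structure.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the two outer bonds onto the plane perpendicular to the
    # central bond, then measure the angle between the projections.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = tuple(x - d0 * y for x, y in zip(b0, b1))
    w = tuple(x - d2 * y for x, y in zip(b2, b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# phi uses the atoms C(i-1), N(i), CA(i), C(i);
# psi uses the atoms N(i), CA(i), C(i), N(i+1).
# Toy coordinates, not taken from a real structure:
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Computing phi and psi for every residue of a structure and plotting one against the other gives the Ramachandran plot described above.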

Solid-Phase Peptide Synthesis

The chemical synthesis of peptides and small proteins is accomplished through solid-phase peptide synthesis, developed by Bruce Merrifield in 1963. The method revolutionized peptide chemistry and earned Merrifield the 1984 Nobel Prize in Chemistry.

The Merrifield Approach

The C-terminal amino acid is attached to an insoluble resin bead through a cleavable linker. The amino group is protected with a temporary protecting group — typically Fmoc or Boc. Deprotection reveals the free amino group, which is coupled to the next protected amino acid. The cycle of deprotection and coupling is repeated until the desired sequence is assembled. The peptide is then cleaved from the resin and deprotected.
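
Because the chain is anchored by its C-terminus, residues are added in the reverse of the order in which a sequence is normally written. The sketch below is a toy bookkeeping model of the cycle described above, not a laboratory protocol. The function name and the step wording are made up for illustration.

```python
def spps_steps(sequence: str, protecting_group: str = "Fmoc") -> list:
    """List the steps to assemble `sequence` (written N-terminus to C-terminus)."""
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    residues = list(sequence)
    steps = [f"attach {protecting_group}-{residues[-1]} to resin via linker"]
    # The chain grows from the C-terminus toward the N-terminus.
    for residue in reversed(residues[:-1]):
        steps.append(f"deprotect: remove {protecting_group}")
        steps.append(f"couple {protecting_group}-{residue}")
    steps.append(f"deprotect: remove {protecting_group}")
    steps.append("cleave peptide from resin and remove side chain protection")
    return steps


# One-letter codes: G = glycine, A = alanine, V = valine.
for step in spps_steps("GAV"):
    print(step)
```

For the tripeptide Gly-Ala-Val, valine is loaded onto the resin first, alanine is coupled next, and glycine is coupled last.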

Protecting Groups

The Fmoc strategy uses a base-labile protecting group: Fmoc is removed with piperidine. The Boc strategy uses an acid-labile protecting group: Boc is removed with trifluoroacetic acid. Side chain protecting groups must be stable to the deprotection conditions used for the temporary protecting group. In the Fmoc strategy, for example, side chain protection is typically acid-labile, so it survives repeated piperidine treatment and is removed only at the end of the synthesis. The choice between Fmoc and Boc strategies depends on the peptide sequence and the target application.

Coupling Reagents

Amide bond formation requires activation of the carboxyl group. Carbodiimide reagents — DCC and DIC — form an O-acylisourea intermediate that reacts with the amino group. Additives like HOBt suppress racemization and improve yield. Modern coupling reagents — HATU, HBTU, and PyBOP — form active esters that react rapidly with the amino group.
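
Coupling efficiency matters because the losses compound over every cycle. The sketch below is a back-of-the-envelope estimate only, assuming each deprotection and coupling cycle proceeds with the same fractional yield. The function name and the example numbers are illustrative, not measured data.

```python
def overall_yield(n_residues: int, step_yield: float) -> float:
    """Fraction of chains with the full sequence after all coupling cycles."""
    # The first residue is loaded onto the resin, so an n-residue peptide
    # needs n - 1 coupling cycles.
    cycles = n_residues - 1
    return step_yield ** cycles


for step in (0.95, 0.99, 0.999):
    print(f"step yield {step:.3f}: overall {overall_yield(51, step):.3f}")
```

Under this simple model, even a one percent loss per cycle leaves well under two thirds of the chains complete for a 51-residue peptide, which is why efficient activation and suppression of side reactions are so important.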

Protein Structure

Primary Structure

The primary structure of a protein, the linear sequence of amino acids, determines all higher levels of organization. The sequence is specified by the gene through the genetic code and is written by convention from the N-terminus to the C-terminus. The relationship between sequence and structure, known as the protein folding problem, has been addressed by AlphaFold and related computational methods that predict three-dimensional structure from sequence with remarkable accuracy.

Secondary Structure

Alpha helices and beta sheets are the most common secondary structural elements. The alpha helix has 3.6 amino acids per turn, with hydrogen bonds between the carbonyl of residue n and the NH of residue n plus 4. The helix is stabilized by the cumulative effect of many hydrogen bonds. Beta sheets consist of extended polypeptide strands linked by hydrogen bonds — parallel sheets have strands running in the same direction, while antiparallel sheets have strands running in opposite directions.
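
The n to n plus 4 hydrogen bonding pattern and the 3.6 residues per turn can be turned into a few lines of arithmetic. The sketch below assumes an ideal alpha helix. The rise of 1.5 angstroms per residue is a standard textbook value that is not stated in this article, and the function names are illustrative.

```python
RESIDUES_PER_TURN = 3.6   # from the text
RISE_PER_RESIDUE = 1.5    # angstroms; textbook value, assumed here


def helix_hbonds(n_residues: int) -> list:
    """Pairs (n, n + 4): carbonyl of residue n bonded to NH of residue n + 4."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


def helix_pitch() -> float:
    """Distance along the helix axis covered by one full turn, in angstroms."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE


print(helix_hbonds(10))
print(f"pitch = {helix_pitch():.1f} angstroms")
```

A helix of n residues has n minus 4 backbone hydrogen bonds of this type, so the first four NH groups and the last four carbonyl groups are left without helical partners.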

Tertiary Structure

Tertiary structure describes the three-dimensional arrangement of secondary structural elements. Hydrophobic interactions drive the folding of nonpolar side chains into the protein interior. Hydrogen bonds, ionic interactions, and van der Waals forces contribute additional stability. Disulfide bonds between cysteine residues provide covalent stabilization, particularly in secreted proteins.

Quaternary Structure

Quaternary structure describes the arrangement of multiple polypeptide chains. Hemoglobin — a tetramer of two alpha and two beta subunits — is the classic example. The quaternary structure enables cooperativity in oxygen binding — the binding of oxygen to one subunit increases the oxygen affinity of the remaining subunits.
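
Cooperative binding is often summarized with the empirical Hill equation. The sketch below is an approximation, not a mechanistic model. The Hill coefficient of about 2.8 and the half-saturation pressure of about 26 mmHg are typical textbook figures for adult human hemoglobin, assumed here for illustration and not taken from this article.

```python
def saturation(p_o2: float, p50: float = 26.0, hill_n: float = 2.8) -> float:
    """Fractional oxygen saturation from the Hill equation.

    p_o2 and p50 are oxygen partial pressures in mmHg; hill_n = 1
    corresponds to non-cooperative binding.
    """
    return p_o2 ** hill_n / (p50 ** hill_n + p_o2 ** hill_n)


for p in (20, 40, 100):
    cooperative = saturation(p)
    independent = saturation(p, hill_n=1.0)
    print(f"pO2 {p:3d} mmHg: cooperative {cooperative:.2f}, "
          f"non-cooperative {independent:.2f}")
```

With a Hill coefficient above 1 the curve is sigmoidal, so saturation changes steeply over a narrow range of oxygen pressure, which is the consequence of cooperativity.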

Post-Translational Modifications

Proteins are modified after translation to achieve their functional forms. Phosphorylation — addition of phosphate to serine, threonine, or tyrosine — regulates enzyme activity and signal transduction. Glycosylation — addition of carbohydrate chains to asparagine or serine/threonine — affects protein folding, stability, and cell-surface recognition. Acetylation, methylation, and ubiquitination modify protein function and stability. These modifications expand the functional diversity of the twenty standard amino acids and create additional recognition motifs for protein-protein interactions.
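
Candidate modification sites can be located with a simple scan of the sequence. The sketch below is illustrative only. It lists every serine, threonine, and tyrosine as a possible phosphorylation site, and it uses the commonly cited N-linked glycosylation motif (asparagine, then any residue except proline, then serine or threonine), which is not stated in this article. Whether a site is actually modified depends on the enzymes present and on the structural context. The function names and the test sequence are hypothetical.

```python
import re


def phosphorylation_candidates(sequence: str) -> list:
    """1-based positions of Ser, Thr, and Tyr residues."""
    return [i + 1 for i, aa in enumerate(sequence) if aa in "STY"]


def n_glycosylation_candidates(sequence: str) -> list:
    """1-based positions of Asn in an N-X-S/T motif where X is not Pro."""
    # The lookahead keeps matches from consuming residues, so overlapping
    # motifs are all reported.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", sequence)]


seq = "MNGSANPTY"  # made-up sequence in one-letter code
print(phosphorylation_candidates(seq))
print(n_glycosylation_candidates(seq))
```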

Enzyme Mechanisms

Enzymes catalyze reactions with remarkable rate accelerations — up to 10^17-fold over uncatalyzed reactions. The active site provides a specific environment that stabilizes the transition state. Acid-base catalysis uses amino acid side chains as proton donors and acceptors — histidine is particularly versatile because its pKa is near physiological pH. Covalent catalysis involves formation of a transient covalent bond between the enzyme and substrate — serine proteases form an acyl-enzyme intermediate during peptide bond hydrolysis. Metal ion catalysis uses bound metal ions to activate substrates or stabilize intermediates.
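
A rate acceleration can be restated as the extra stabilization of the transition state. The sketch below is a rough estimate that assumes simple transition state theory, in which the rate ratio equals exp(ΔΔG / RT), and a temperature of 298 K. The function name is illustrative.

```python
import math

R = 8.314  # gas constant in J/(mol*K)


def transition_state_stabilization(rate_enhancement: float,
                                   temperature: float = 298.0) -> float:
    """Lowering of the activation free energy, in kJ/mol."""
    return R * temperature * math.log(rate_enhancement) / 1000


for factor in (1e6, 1e12, 1e17):
    ddg = transition_state_stabilization(factor)
    print(f"{factor:.0e}-fold: about {ddg:.0f} kJ/mol")
```

Under these assumptions, the largest accelerations quoted above correspond to lowering the activation barrier by roughly 100 kJ/mol.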

Protein Folding and Misfolding

Protein folding is the process by which a linear polypeptide chain adopts its native three-dimensional structure. Folding is guided by the hydrophobic effect and by the formation of specific hydrogen bonds and van der Waals contacts. Levinthal’s paradox observes that a protein cannot find its native structure by sampling all possible conformations at random: small proteins typically fold in microseconds to seconds, whereas a random search over the conformations of a 100-residue chain would take longer than the age of the universe. Folding must therefore follow directed pathways rather than an exhaustive search.
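
The scale of Levinthal’s argument can be checked with simple arithmetic. The numbers below are the usual illustrative assumptions, not measurements: three conformations per residue, ten trillion conformations sampled per second, and an age of the universe of about 13.8 billion years.

```python
STATES_PER_RESIDUE = 3        # assumed number of conformations per residue
SAMPLING_RATE = 1e13          # assumed conformations sampled per second
AGE_OF_UNIVERSE_S = 4.35e17   # about 13.8 billion years, in seconds


def random_search_time(n_residues: int) -> float:
    """Seconds needed to visit every conformation once."""
    return STATES_PER_RESIDUE ** n_residues / SAMPLING_RATE


seconds = random_search_time(100)
print(f"random search: {seconds:.2e} s")
print(f"ratio to age of universe: {seconds / AGE_OF_UNIVERSE_S:.1e}")
```

Even with these generous assumptions, an exhaustive search for a 100-residue chain exceeds the age of the universe by many orders of magnitude.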

Chaperone proteins assist folding by preventing aggregation and providing a protected environment for folding. Misfolded proteins can form aggregates — amyloid fibrils — associated with neurodegenerative diseases including Alzheimer’s, Parkinson’s, and Huntington’s. The prion diseases — Creutzfeldt-Jakob disease in humans and bovine spongiform encephalopathy in cattle — involve the transmission of misfolded protein conformations.

Frequently Asked Questions

How many standard amino acids are there? There are twenty standard amino acids encoded by the genetic code. Two additional amino acids — selenocysteine and pyrrolysine — are incorporated into proteins in specific organisms under special circumstances.

What determines the structure of a protein? The amino acid sequence determines the three-dimensional structure. Interactions among amino acid side chains — hydrophobic effects, hydrogen bonds, electrostatic interactions, and van der Waals forces — drive folding. The cellular environment and chaperone proteins influence the folding process.

How are disulfide bonds formed in proteins? Disulfide bonds form by oxidation of two cysteine thiol groups. In the oxidizing environment of the endoplasmic reticulum, disulfide bond formation is catalyzed by protein disulfide isomerase. Disulfide bonds stabilize secreted proteins. In the reducing environment of the cytoplasm, they tend to be reduced back to free thiols.
