CN550: Neural and Computational Models of Recognition, Memory, and Attention

Spring 2011 Syllabus

INSTRUCTORS:

Prof. Anatoli Gorchetchnikov

Office: Rm. 213, 677 Beacon Street

Office hours: Monday 2-4pm, Friday 1-2pm, or by appointment (email works best)

Email: anatoli (at) cns (dot) bu (dot) edu

Dr. Heather Ames

Office: Rm. 308B, 677 Beacon Street

Office hours: Monday 2-4pm or by appointment (email works best)

Email: starfly (at) cns (dot) bu (dot) edu

TEACHING ASSISTANT:

None

COURSE DESCRIPTION:

CN550 develops neural network models of how internal representations of sensory events and cognitive hypotheses are learned and remembered, and of how such representations enable recognition and recall of these events. Various neural and statistical pattern recognition models, and their historical development and applications, are analyzed. Special attention is given to stable self-organization of pattern recognition and recall by Adaptive Resonance Theory (ART) models. Mathematical techniques and definitions to support fluent access to the neural network and pattern recognition literature are developed throughout the course. Experimental data and theoretical analyses from cognitive psychology, neuropsychology, and neurophysiology of normal and abnormal individuals are also discussed. Course work emphasizes skill development, including writing, mathematics, computational analysis, teamwork, and oral communication.

CLASS PROJECT:

CN550 includes a class project, as described in the accompanying materials. Part of each class is devoted to discussion of the class project and planning for the coming week. Each student will work in a group with one or two other students. Groups should plan to meet during the weekly discussion session and at other times, as needed.

COMPUTATIONAL WORKSHOPS:

Each class will conclude with a computational workshop.

HOMEWORK:

For the first part of the class, a phase plane assignment serves as multi-week homework, due at the midterm. For the second part, the class project serves as homework.

GRADING CRITERIA:

Grades are determined by performance on:

1.      essay on readings -- 5%

2.      homework assignment -- 10%

3.      computational workshops -- 20%

4.      class project -- 20%

5.      midterm exam -- 20%

6.      final exam -- 25%

Participation in class discussions will play a role in determining the final letter grade in borderline cases.

Late homework policy: 10% penalty if turned in less than one week late, 20% penalty for 1-2 weeks late, and 30% penalty for > 2 weeks late. No late homework will be accepted after the final exam.

REQUIRED TEXT:

Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.

RECOMMENDED TEXTS:

Schacter, Daniel L. (1996) Searching for Memory: The Brain, the Mind, and the Past. New York: Basic Books. (paper).

Kandel, E., Schwartz, J.H., and Jessell, T.M. (2000). Principles of Neural Science, 4th Edition. New York: McGraw-Hill.

Levine, D.S. (2000). Introduction to Neural and Cognitive Modeling, 2nd Edition. Hillsdale, NJ: Erlbaum.

Strunk, William, Jr., & White, E.B. (1959-2000) The Elements of Style, Fourth Edition. Needham Heights, MA: Allyn & Bacon.

OTHER USEFUL RESOURCES:

APA style: http://www.apastyle.org/, http://www.bridgewater.edu/WritingCenter/manual/APAformat.htm

Hettich, S., & Bay, S.D. (1999) The UCI KDD Archive. Irvine, CA: University of California, Department of Information and Computer Science. http://www.ics.uci.edu/~mlearn/MLRepository.html

CN 710 materials

Fall 2008: http://cns.bu.edu/cn710/Fall2008/

Fall 2007: http://cns.bu.edu/cn710/Fall2007/

Fall 2006: http://cns.bu.edu/cn710/Fall2006/pmwiki.php?n=Main.HomePage

Spring 2006: http://cns.bu.edu/cn710/Spring2006/pmwiki.php?n=Main.HomePage

OTHER USEFUL TEXTS:

Too many to list here; please download the PDF.


SESSION 1 (January 24) Overview, history, philosophy, benchmark database studies

Course goals, topics, methods, assignments.

Historical review of principal neural network modules for learning, pattern recognition, and associative memory.

Class project: Comparative studies of supervised learning systems.

Benchmark database studies.

Readings:

Daugman, John G. (1990) Brain metaphor and brain theory. In Eric Schwartz (Ed.) Computational Neuroscience. Cambridge, Mass.: MIT Press. Chapter 2: pp. 9-18.

Borges, Jorge Luis (1942) Funes, the Memorious. In: Ficciones (translation), New York: Grove Press (1962), pp. 107-115.

Henig, Robin Marantz (2004) The quest to forget. The New York Times Magazine, April 4, 2004, pp. 32-37.

Treffert, Darold A., and Christensen, Daniel D. (2005) Inside the mind of a savant. Scientific American, Dec., pp. 108-113.

Carpenter, Gail A. (1989) Neural network models for pattern recognition and associative memory. Neural Networks, 2, 243-257.

McCulloch, Warren S., & Pitts, Walter (1943) A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115-133.

Bower, Gordon H. (2000) A brief history of memory research. In Endel Tulving & Fergus I.M. Craik, (Eds.) The Oxford Handbook of Memory. New York: Oxford University Press, Chapter 1, pp. 3-32.

Smith, Edward E., & Medin, Douglas L. (1981) Categories and Concepts. Cambridge, Mass.: Harvard University Press. Chapters 1-2, pp. 1-21.

Grossberg, Stephen (1982) Studies of Mind and Brain. Boston: Reidel / Kluwer Publ. - Preface, Introduction, and prefaces of chapters 1-13.

Supplemental materials:

http://www.npr.org/templates/story/story.php?storyId=5352811

Unique memory lets woman replay life like a movie. Morning Edition, April 20, 2006 -- Neurobiologist James McGaugh, one of the world's experts on human memory, says that a woman he calls AJ has a one-of-a-kind memory. In an interview with NPR, she talks about what life is like for someone who can remember things she's done and news events from almost every day of her life for the past 25 years. Her life is like a split-screen movie, with the past running almost as vividly as the present.

Clive Wearing: Living without memory. YouTube (BBC: The Mind): Pt2a Pt2b Pt2c Pt2d http://en.wikipedia.org/wiki/Clive_Wearing

Clive Alex Wearing (born 1938) is a British musicologist, conductor, and keyboardist who suffers from an acute and long-lasting case of anterograde amnesia: he lacks the ability to form new memories, a condition laypeople and the media have dubbed the "memento" syndrome, after a film based on the subject.

Lecture Notes:

PDF


SESSION 2 (January 31)

Supervised learning methods: memory-based algorithms (K-NN), model-independent supervised learning methods (validation and cross-validation, the c-index, ROC curves, resampling, combining classifiers, component analysis), and statistical pattern recognition.

Memory-based algorithms: K-nearest neighbors (K-NN)

Approaching supervised learning problems fairly and systematically

Training, testing, validation, and cross-validation

ROC curves and the c-index

Resampling: bootstrapping, boosting, bagging

Combining systems: mixing models and voting

Data preparation: component analysis

Brief introduction to statistical pattern recognition and Bayesian estimation
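
As a concrete starting point for the memory-based algorithms above, here is a minimal K-nearest-neighbors classifier in plain Python; the toy data and the choices of Euclidean distance and k = 3 are invented for illustration:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    `train` is a list of (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D data: two clusters labeled 'A' and 'B'
train = [((0.0, 0.0), 'A'), ((0.1, 0.2), 'A'), ((0.2, 0.1), 'A'),
         ((1.0, 1.0), 'B'), ((0.9, 1.1), 'B'), ((1.1, 0.9), 'B')]

print(knn_classify(train, (0.15, 0.1)))  # → A
print(knn_classify(train, (0.95, 1.0)))  # → B
```

Leave-one-out cross-validation, one of this session's topics, amounts to classifying each training point with that point held out of `train`.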

Readings:

Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.

1.      Section 2.8.3: Signal detection theory and operating characteristics, pp. 48-51.

2.      Sections 3.1-3.4, pp. 84-97.

3.      Section 3.8: Component analysis and discriminants, pp. 114-124.

4.      Sections 4.1-4.6: Nonparametric techniques, pp. 161-192.

5.      Section 9.4: Resampling for estimating statistics, pp. 471-475.

6.      Section 9.5: Resampling for classifier design, pp. 475-482.

7.      Section 9.6.2: Cross-validation, pp. 483-485.

8.      Section 9.7: Combining classifiers, pp. 495-499.

Carpenter, Gail A., Grossberg, Stephen, Markuzon, Natalya, Reynolds, John H., & Rosen, David B. (1992) Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Transactions on Neural Networks, 3, 698-713.

http://en.wikipedia.org/wiki/Resampling_%28statistics%29

http://en.wikipedia.org/wiki/Bootstrapping_%28statistics%29

http://en.wikipedia.org/wiki/Bootstrap_aggregating

http://en.wikipedia.org/wiki/Boosting

http://en.wikipedia.org/wiki/Principal_Component_Analysis

http://en.wikipedia.org/wiki/Fisher_linear_discriminant

http://en.wikipedia.org/wiki/Maximum_likelihood

http://en.wikipedia.org/wiki/Bayes%27_theorem

Lecture Notes:

PDF


SESSION 3 (February 7) Unsupervised learning: Clustering (leader, K-means), competitive learning, ART

Clustering algorithms: Leader clustering and K-means clustering

Norms and metrics

Competitive learning

Adaptive resonance theory - 1970s

ART 1: Binary pattern learning

ART 2-A: A fast, algorithmic version of ART 2

Freud's neural networks
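
The clustering topics above can be sketched in a few lines. This is a minimal K-means (Lloyd's algorithm) on an invented toy data set, initialized deterministically from the first k points for reproducibility; real implementations add random restarts:

```python
import math

def kmeans(points, k, iters=20):
    """Lloyd's algorithm: alternate nearest-center assignment and centroid
    update until the partition stabilizes."""
    centers = list(points[:k])
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty)
        centers = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers

points = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
          (2.0, 2.0), (2.2, 2.0), (2.0, 2.2)]
centers = sorted(kmeans(points, 2))
# Each center converges to the mean of one cluster
```

Leader clustering differs in that it makes a single pass, recruiting a new cluster whenever a point falls outside a fixed radius of every existing leader — a structure closely related to the ART vigilance test.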

Readings:

Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley. Section 10.4.3: k-means clustering, pp. 526-528.

Levine, Daniel S. (2000) Introduction to Neural and Cognitive Modeling. Hillsdale, NJ: Lawrence Erlbaum Associates, 2nd Edition.

1.      Chapter 4: Competition, lateral inhibition, and short-term memory, pp. 95-154

2.      Chapter 6: Coding and categorization, pp. 198-279

Malsburg, Christoph von der (1973) Self-organization of orientation sensitive cells in the striate cortex. Kybernetik, 14, 85-100.

Grossberg, Stephen (1976) Adaptive pattern classification and universal recoding, I: Parallel development and coding of neural feature detectors. Biological Cybernetics, 23, 121-134.

Grossberg, Stephen (1976) Adaptive pattern classification and universal recoding, II: Feedback, expectation, olfaction, and illusions. Biological Cybernetics, 23, 187-202.

Carpenter, Gail A., & Grossberg, Stephen (1987) A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing, 37, 54-115.

Moore, Barbara (1989) ART 1 and pattern clustering. In David S. Touretzky, Geoffrey Hinton, & Terrence Sejnowski (Eds.) Proceedings of the 1988 Connectionist Models Summer School. San Mateo, Calif.: Morgan Kaufmann Publishers. pp. 174-185.

Carpenter, Gail A., Grossberg, Stephen, & Rosen, David B. (1991) ART 2-A: An Adaptive Resonance algorithm for rapid category learning and recognition. Neural Networks, 4, 493-504.

Freud, Sigmund (1886-1899) Project for a Scientific Psychology, pp. 322-325; and (1900) The Interpretation of Dreams. Introduction by James Strachey (Editor and translator). New York: Avon Books (1965).

http://en.wikipedia.org/wiki/K-means

Lecture Notes:

PDF


SESSION 4 (February 14) Dimensional analysis, competitive networks, phase plane analysis

Dimensional analysis

Dynamics of on-center off-surround shunting competitive networks

Phase plane analysis of competitive networks
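
As a small numerical companion to the topics above, the sketch below Euler-integrates a two-node on-center off-surround shunting network; the parameter values and inputs are arbitrary illustrative choices:

```python
def shunting(I, A=1.0, B=1.0, dt=0.01, steps=4000):
    """Forward-Euler integration of the shunting competitive network
    dx_i/dt = -A*x_i + (B - x_i)*I_i - x_i * sum_{j != i} I_j."""
    x = [0.0] * len(I)
    total = sum(I)
    for _ in range(steps):
        x = [xi + dt * (-A * xi + (B - xi) * Ii - xi * (total - Ii))
             for xi, Ii in zip(x, I)]
    return x

I = [3.0, 1.0]
x = shunting(I)
# Setting dx_i/dt = 0 gives the equilibrium x_i = B*I_i / (A + sum(I)),
# here [0.6, 0.2]: input ratios are preserved while total activity stays bounded.
```

The closed-form equilibrium exhibits the normalization property of shunting networks: each activity is the cell's input divided by the total input plus decay, so the network computes relative, not absolute, input strength.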

Readings:

Lin, C.C., & Segel, L.A. (1974) Mathematics Applied to Deterministic Problems in the Natural Sciences. New York: Macmillan.

1.      Chapter 6: Simplification, dimensional analysis, and scaling, pp. 185-224

Edelstein-Keshet, Leah (1988) Mathematical Models in Biology. SIAM Classics in Applied Mathematics, vol. 46.

1.      Section 4.3: Formulating a model

2.      Section 4.4: Saturating nutrient consumption rate

3.      Section 4.5: Dimensional analysis of the equations

4.      Sections 5.2-5.9: Phase-plane methods and qualitative solutions, pp. 171-193

Boston University Ordinary Differential Equations Project: http://math.bu.edu/odes/.

1.      Section 3.3: Phase planes for linear systems with real eigenvalues, pp. 266-282.

2.      Section 5.2: Qualitative analysis, pp. 457-470.

3.      Section 5.3: Hamiltonian systems, pp. 470-488.

4.      Section 5.4: Dissipative systems, pp. 488-510.

http://en.wikipedia.org/wiki/Phase_plane

http://en.wikipedia.org/wiki/Dimensional_analysis

http://en.wikipedia.org/wiki/Hamiltonian_mechanics

http://en.wikipedia.org/wiki/Dissipative

Lecture Notes:

PDF


SESSION 5 (February 22 Attention! Tuesday Class!) ARTMAP

Fuzzy ART: Generalized ART 1, for analog inputs, using the city-block metric (L1 norm)

Supervised learning by ART systems

Binary ARTMAP

Analog fuzzy ARTMAP
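
A minimal sketch of the fuzzy ART category choice, vigilance test, and fast-learning steps; the vigilance and choice parameter values are illustrative, and the ARTMAP match-tracking machinery built on top of this is omitted:

```python
def fuzzy_and(a, b):
    """Component-wise minimum: the fuzzy intersection."""
    return [min(x, y) for x, y in zip(a, b)]

def l1(v):
    """City-block (L1) norm for non-negative vectors."""
    return sum(v)

def complement_code(a):
    """I = (a, 1 - a), so |I| is constant across inputs."""
    return a + [1.0 - x for x in a]

def fuzzy_art_step(I, weights, rho=0.75, alpha=0.001, beta=1.0):
    """One input presentation: search categories in order of the choice
    function T_j = |I ^ w_j| / (alpha + |w_j|); accept the first that passes
    the vigilance test |I ^ w_j| / |I| >= rho, then learn."""
    order = sorted(range(len(weights)),
                   key=lambda j: -l1(fuzzy_and(I, weights[j])) / (alpha + l1(weights[j])))
    for j in order:
        if l1(fuzzy_and(I, weights[j])) / l1(I) >= rho:
            w = weights[j]
            weights[j] = [beta * m + (1 - beta) * wi
                          for m, wi in zip(fuzzy_and(I, w), w)]
            return j
    weights.append(list(I))   # no category passes vigilance: recruit a new one
    return len(weights) - 1

weights = []
j0 = fuzzy_art_step(complement_code([0.9, 0.1]), weights)    # new category 0
j1 = fuzzy_art_step(complement_code([0.85, 0.15]), weights)  # similar: joins 0
j2 = fuzzy_art_step(complement_code([0.1, 0.9]), weights)    # dissimilar: new category 1
```

Raising the vigilance rho forces finer categories; with rho = 1 every distinct input recruits its own category, echoing the stability-plasticity trade-off discussed in the ART readings.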

Readings:

Carpenter, Gail A., Grossberg, Stephen, & Rosen, David B. (1991) Fuzzy ART: Fast stable learning and categorization of analog patterns by an Adaptive Resonance system. Neural Networks, 4, 759-771.

Carpenter, Gail A., Grossberg, Stephen, & Reynolds, John H. (1991) ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network. Neural Networks, 4, 565-588.

Carpenter, Gail A., Grossberg, Stephen, Markuzon, Natalya, Reynolds, John H., & Rosen, David B. (1992) Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Transactions on Neural Networks, 3, 698-713.

Carpenter, Gail A. (2003). Default ARTMAP. Proceedings of the International Joint Conference on Neural Networks (IJCNN'03), Portland, Oregon, 1396-1401.

Frey, Peter W., & Slate, David J. (1991) Letter recognition using Holland-style adaptive classifiers. Machine Learning, 6, 161-182.

Zadeh, Lotfi A. (1965) Fuzzy sets. Information Control, 8, 338-353.

http://en.wikipedia.org/wiki/Fuzzy_sets

http://en.wikipedia.org/wiki/Fuzzy_logic

Lecture Notes:

PDF


SESSION 6 (February 28) Associative memory networks: Back propagation, multi-layer perceptrons, radial basis functions, cascade-correlation, higher-order networks

Back propagation

Multi-layer perceptrons

(Local) minimization of cost functions

Radial basis functions (RBFs)

Cascade-correlation architecture

Higher order networks
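
A minimal one-hidden-layer perceptron trained by back propagation on XOR, in plain Python; the layer size, learning rate, and epoch count are arbitrary illustrative choices, and convergence from a random start is typical but not guaranteed:

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_xor(hidden=4, lr=1.0, epochs=3000, seed=1):
    """One-hidden-layer perceptron trained by back propagation (per-sample
    gradient descent on squared error). Returns the loss after each epoch."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, t in data:
            # forward pass
            h = [sigmoid(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j])
                 for j in range(hidden)]
            y = sigmoid(sum(W2[j] * h[j] for j in range(hidden)) + b2)
            total += (y - t) ** 2
            # backward pass: deltas from the chain rule through the sigmoids
            dy = (y - t) * y * (1 - y)
            for j in range(hidden):
                dh = dy * W2[j] * h[j] * (1 - h[j])
                W2[j] -= lr * dy * h[j]
                b1[j] -= lr * dh
                W1[j][0] -= lr * dh * x[0]
                W1[j][1] -= lr * dh * x[1]
            b2 -= lr * dy
        losses.append(total)
    return losses

losses = train_xor()
# The squared error typically falls substantially over training
```

XOR is the classic example of a problem a single-layer perceptron cannot solve, which is why the hidden layer (and the credit-assignment rule to train it) matters.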

Readings:

Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.

1. Sections 6.1-6.8: Multilayer neural networks, pp. 282-318.

2. Section 6.10.1: Radial basis function networks (RBFs), pp. 324-325.

3. Section 6.10.6: Cascade-correlation, pp. 329-330.

Fahlman, Scott E., & Lebiere, Christian (1990) The cascade-correlation learning architecture. In David S. Touretzky (Ed.) Neural Information Processing Systems 2, Proceedings of the NIPS Conference, Denver, 1989, San Mateo, Calif.: Morgan Kaufmann Publishers. pp. 524-532.

Giles, C. Lee, & Maxwell, Thomas (1987) Learning, invariance, and generalization in high-order neural networks. Applied Optics, 26, 4972-4978.

Lowe, David (2003) Radial basis function networks. In Arbib, Michael A. (2003) The Handbook of Brain Theory and Neural Networks, Second Edition. Cambridge, Mass.: MIT Press. pp. 937-940.

Moody, John, & Darken, Christian J. (1989) Fast learning in networks of locally-tuned processing units. Neural Computation, 1, 281-294.

Rosenblatt, F. (1958) The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386-408.

Rumelhart, David E., Hinton, Geoffrey E., & Williams, Ronald J. (1986) Learning internal representations by error propagation. In David E. Rumelhart & James L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1. Cambridge, Mass.: MIT Press. pp. 318-362.

http://en.wikipedia.org/wiki/Cascade_correlation

Lecture Notes:

PDF


SESSION 7 (March 7) Support vector machines

Support vector machines (SVMs)

Constrained optimization

Lagrange multipliers
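
A worked example of the Lagrange-multiplier condition, with a numeric cross-check; the objective and constraint are invented for illustration:

```python
# Maximize f(x, y) = x*y subject to g(x, y) = x + y - 1 = 0.
# Lagrange condition: grad f = lam * grad g  =>  (y, x) = (lam, lam)  =>  x = y.
# Combined with the constraint, x = y = 1/2, so the maximum is f* = 1/4.

def f(x, y):
    return x * y

# Numeric check: scan along the constraint line x + y = 1
best = max((f(x, 1 - x), x) for x in [i / 1000 for i in range(1001)])
print(best)  # → (0.25, 0.5)
```

The SVM training problem has the same shape at a larger scale: a quadratic objective maximized subject to linear constraints, with one multiplier per training point; the points whose multipliers are nonzero are the support vectors.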

Readings:

Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.

1.      Section A.3: Lagrange optimization, p. 610.

2.      Section 5.11: Support vector machines, pp. 259-265.

Vapnik, Vladimir N. (1998) Statistical Learning Theory. New York: John Wiley.

1.      Section 9.5: Three theorems of optimization theory, pp. 390-394.

2.      Chapter 10: The support vector method for estimating indicator functions, pp. 401-441.

Strang, Gilbert (1988) Linear Algebra and its Applications, Third Edition. New York: Harcourt Brace Jovanovich College Publishers. Section 8.3: The theory of duality, pp. 412-423.

Bartlett, Peter L., & Maass, Wolfgang (2003) Vapnik-Chervonenkis dimension of neural nets. In Arbib, Michael A. (2003) The Handbook of Brain Theory and Neural Networks, Second Edition. Cambridge, Mass.: MIT Press. pp. 1188-1192.

Bishop, Christopher M. (2006) Pattern Recognition and Machine Learning. Springer. Appendix E: Lagrange multipliers, pp. 707-710.

http://en.wikipedia.org/wiki/Optimization_%28mathematics%29

http://en.wikipedia.org/wiki/Dual_space

http://en.wikipedia.org/wiki/Constrained_Optimization_and_lagrange_Multipliers

http://en.wikipedia.org/wiki/Support_vector_machines

Bennett and Campbell paper

A very good 4.5-hour video lecture given by Chih-Jen Lin at the Machine Learning Summer School in 2006.

Lecture Notes:

PDF


SESSION 8 (March 21) Mid-term Exam

Phase plane assignment is due before the exam begins.


SESSION 9 (March 28) Physiology, psychology, and memory

Neural substrates of memory

Cortical organization

Neuropsychology of memory and amnesia

Neurobiology of chemical synapses, neuromodulators, and short-term synaptic plasticity

Synaptic modification

Retrograde messengers

Readings:

Bear, Mark F., Connors, Barry W., & Paradiso, Michael A. (1996) Neuroscience: Exploring the Brain. Baltimore: Williams & Wilkins. Chapter 19: Memory systems, pp. 514-545.

Levine, Daniel S. (2000) Introduction to Neural and Cognitive Modeling. Hillsdale, NJ: Lawrence Erlbaum Associates, 2nd Edition. Appendix 1: Basic Facts of Neurobiology, pp. 375-395.

Corkin, Suzanne (2002) What's new with the amnesic patient H.M.? Nature Reviews Neuroscience, 3, 153-160.

Freedman, David J., Riesenhuber, Maximilian, Poggio, Tomaso, & Miller, Earl K. (2003) A comparison of primate prefrontal and inferior temporal cortices during visual categorization. Journal of Neuroscience, 23, 5235-5246.

Kandel, Eric R., Schwartz, James H., & Jessell, Thomas M. (Eds.) (2000) Principles of Neural Science. 4th Edition. New York: McGraw-Hill. Chapter 63: Eric R. Kandel. Cellular mechanisms of learning and the biological basis of individuality. pp. 1247-1279.

Kandel, Eric R., Schwartz, James H., & Jessell, Thomas M. (Eds.) (2000) Principles of Neural Science. 4th Edition. New York: McGraw-Hill. Chapter 10: Eric R. Kandel & Steven A. Siegelbaum. Overview of synaptic transmission, pp. 175-186.

Malenka, Robert C., & Nicoll, Roger A. (1999) Long-term potentiation - a decade of progress? Science, 285, 1870-1874.

Zucker, Robert S. (1989) Short-term synaptic plasticity. Annual Review of Neuroscience, 12, pp. 13-31.

Atkinson, Richard C., & Shiffrin, Richard M. (1971) The control of short-term memory. Scientific American, 82-90.

http://en.wikipedia.org/wiki/Synapse

http://en.wikipedia.org/wiki/Retrograde_signaling_in_LTP

http://en.wikipedia.org/wiki/Nitric_oxide

http://videocast.nih.gov/podcast.asp?13746

http://en.wikipedia.org/wiki/Long-term_potentiation

http://en.wikipedia.org/wiki/HM_%28patient%29

http://en.wikipedia.org/wiki/Neuron

http://en.wikipedia.org/wiki/Cerebral_cortex

http://en.wikipedia.org/wiki/Atkinson-Shiffrin_memory_model

Lecture Notes:

PDF


SESSION 10 (April 4) Decision Trees

Decision trees
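
The split criterion at the heart of tree growing can be illustrated with entropy and information gain on an invented four-row data set:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(data, feature):
    """Entropy reduction from splitting `data` (a list of (features, label)
    pairs) on one feature -- the impurity criterion a tree-growing algorithm
    uses to pick the next split."""
    labels = [y for _, y in data]
    gain = entropy(labels)
    for v in {x[feature] for x, _ in data}:
        subset = [y for x, y in data if x[feature] == v]
        gain -= len(subset) / len(data) * entropy(subset)
    return gain

# Toy data: feature 'a' predicts the label perfectly; 'b' carries no information
data = [({'a': 0, 'b': 0}, 'no'),  ({'a': 0, 'b': 1}, 'no'),
        ({'a': 1, 'b': 0}, 'yes'), ({'a': 1, 'b': 1}, 'yes')]

print(info_gain(data, 'a'))  # → 1.0
print(info_gain(data, 'b'))  # → 0.0
```

A tree grower applies this greedily: split on the highest-gain feature, then recurse on each branch until the leaves are pure (or a stopping or pruning rule fires).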

Readings:

Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.

1. Chapter 8: Nonmetric Methods, pp. 394-436.

Lecture Notes:

PDF


SESSION 11 (April 11) Liapunov Functions, Cohen-Grossberg Theorem, Hierarchical Temporal Memories

Liapunov functions and the LaSalle invariance principle

The Cohen-Grossberg theorem

Hierarchical temporal memories (guest lecture by John Agapiou)
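
A minimal illustration of a Liapunov function at work: the Hopfield-style energy below is a special case of the Cohen-Grossberg function, and asynchronous threshold updates can never increase it. The weights and initial state here are random but seeded, purely for illustration:

```python
import random

def energy(W, s):
    """Hopfield energy E = -1/2 * sum_ij W[i][j]*s_i*s_j, a Liapunov function
    when W is symmetric with zero diagonal."""
    n = len(s)
    return -0.5 * sum(W[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def step(W, s, i):
    """Asynchronous update of unit i; each such update cannot raise E."""
    h = sum(W[i][j] * s[j] for j in range(len(s)))
    s[i] = 1 if h >= 0 else -1

rng = random.Random(0)
n = 6
W = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        W[i][j] = W[j][i] = rng.uniform(-1, 1)   # symmetric, zero diagonal

s = [rng.choice([-1, 1]) for _ in range(n)]
energies = []
for t in range(50):
    energies.append(energy(W, s))
    step(W, s, t % n)

# E is monotonically non-increasing, so the state settles into a fixed point
assert all(e2 <= e1 + 1e-12 for e1, e2 in zip(energies, energies[1:]))
```

The Cohen-Grossberg theorem generalizes this bookkeeping: for a broad class of competitive networks with symmetric interactions, an explicit Liapunov function exists, so (by the LaSalle invariance principle) every trajectory converges to the set of equilibria — content-addressable memory storage.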

Readings:

Cohen, M. and Grossberg, S. (1983). Absolute stability of global pattern formation and parallel memory storage by competitive neural networks. IEEE Transactions on Systems, Man, and Cybernetics, 13, pp. 815-826.

Grossberg, Stephen (1988) Nonlinear neural networks: Principles, mechanisms, and architectures. Neural Networks, 1, 17-61. Section 9 - Content-addressable memory storage: a general STM model and Liapunov method, pp. 24-30.

George, D. and Hawkins, J. (2009). Towards a mathematical theory of cortical micro-circuits. PLoS Computational Biology, 5(10), e1000532.

Brauer, Fred, & Nohel, John (1969). The qualitative theory of differential equations. W.A. Benjamin. Sections 5.1 and 5.2.

Bishop, Christopher M. (2006) Pattern Recognition and Machine Learning. Springer. Section 8.4.

Lecture Notes:

PDF

Guest Lecture PDF (not here yet)


SESSION 12 (April 21 Attention! Thursday Class!) Boltzmann Machines; Genetic Algorithms
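
As a toy illustration of the genetic-algorithm topic, here is a minimal GA on the classic "OneMax" problem; all parameter choices are invented for this sketch:

```python
import random

def onemax_ga(length=12, pop_size=20, generations=40, p_mut=0.05, seed=0):
    """Toy genetic algorithm maximizing the number of 1-bits in a bitstring:
    tournament selection, one-point crossover, bit-flip mutation, and
    elitism (the current best individual always survives unchanged)."""
    rng = random.Random(seed)
    fitness = sum
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    start_best = max(map(fitness, pop))
    for _ in range(generations):
        new = [max(pop, key=fitness)[:]]               # elitism
        while len(new) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)   # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, length)             # one-point crossover
            child = [bit ^ (rng.random() < p_mut)      # bit-flip mutation
                     for bit in a[:cut] + b[cut:]]
            new.append(child)
        pop = new
    return start_best, max(map(fitness, pop))

start, end = onemax_ga()
# Thanks to elitism, the best fitness never decreases across generations
```

The same stochastic-search flavor connects this session's two topics: Boltzmann machines also explore a state space randomly, but guided by a temperature-controlled energy function rather than by recombination.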

Readings:

Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley. Chapter 7: Stochastic Methods, pp. 350-393.

Lecture Notes:

PDF


SESSION 13 (April 25) Invariance, Integral Transforms, Moments

Invariant pattern recognition

Fourier analysis

Log-polar-Fourier filter

Algebraic invariance

Requirements for invariant pattern recognition system
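
A small demonstration of one invariance discussed in this session: the magnitude of the Fourier transform is unchanged by circular translation of the input. The signal is invented, and the DFT is the direct O(N^2) form for clarity:

```python
import cmath

def dft_mag(x):
    """Magnitude spectrum of the discrete Fourier transform (direct form)."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                    for n in range(N)))
            for k in range(N)]

signal = [0, 1, 2, 1, 0, 0, 0, 0]
shifted = signal[3:] + signal[:3]      # circular shift (translation)

m1, m2 = dft_mag(signal), dft_mag(shifted)
# A shift multiplies each coefficient by a unit-magnitude phase factor,
# so the magnitude spectra agree exactly (up to rounding)
assert all(abs(a - b) < 1e-9 for a, b in zip(m1, m2))
```

Combining this with a log-polar resampling of a 2-D magnitude spectrum converts rotation and scale into translations, which a second Fourier magnitude then removes — the log-polar-Fourier filter idea listed above.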

Readings:

Cavanagh, Patrick (1984) Image transforms in the visual system. In Peter C. Dodwell & Terry Caelli (Eds.) Figural Synthesis. Hillsdale, NJ: Lawrence Erlbaum Associates. pp. 185-218.

Wood, Jeffrey. (1996) Invariant pattern recognition: A review. Pattern Recognition, 29(1), 1-17.

http://en.wikipedia.org/wiki/Fourier_transform

http://en.wikipedia.org/wiki/Complex_logarithm

http://en.wikipedia.org/wiki/Image_moment

http://en.wikipedia.org/wiki/Zernike_polynomials

Lecture Notes:

PDF


SESSION 14 (May 2) Class Project Student Presentations.


FINAL EXAM Date: May 10th at 5PM