Ivy’s Digital Garden
1 Journal Club
1.1 NINJ1 mediates plasma membrane rupture during lytic cell death
1.2 Adoptive cellular therapy with T cells expressing the dendritic cell growth factor Flt3L drives epitope spreading and antitumor immunity
1.3 Structural cells are key regulators of organ-specific immune responses
1.4 Genome-wide CRISPR–Cas9 screening reveals ubiquitous T cell cancer targeting via the monomorphic MHC class I-related protein MR1
1.5 Intratumoral Activity of the CXCR3 Chemokine System Is Required for the Efficacy of Anti-PD-1 Therapy
2 Immunology
3 Protein Engineering
4 Sequencing
5 Data Science
5.1 Setup
How to set up VSCode
Weird behaviors
5.2 Statistical Inference
5.2.1 Why Statistical Inference
5.2.2 Statistical Learning
5.2.3 Linear Regression
5.2.4 Classification
5.2.5 Resampling Methods
5.2.6 Linear Model Selection and Regularization
5.2.7 Moving Beyond Linearity
5.2.8 Tree-Based Methods
5.2.8.1 The Basics of Decision Trees
5.2.8.1.1 Regression Trees
5.2.8.1.2 Prediction via Stratification of the Feature Space
5.2.8.1.3 Classification Trees
5.2.8.1.4 Trees Versus Linear Models
5.2.8.1.5 Advantages and Disadvantages of Trees
5.2.8.2 Bagging, Random Forest, Boosting
5.2.8.2.1 Bagging
5.2.8.2.2 Random Forest
5.2.8.2.3 Boosting
5.2.9 Unsupervised Learning
5.2.9.1 Challenges in unsupervised learning
5.2.9.2 Principal Components Analysis (PCA)
5.2.9.2.1 What are principal components?
5.2.9.2.2 Another interpretation of principal components
5.2.9.2.3 The Proportion of Variance Explained (PVE) in PCA
5.2.9.2.4 Uniqueness of the Principal Components
5.2.9.2.5 Deciding How Many Principal Components to Use
5.2.9.2.6 Other uses for principal components
5.2.9.3 Clustering methods
5.2.9.3.1 K-means clustering
5.2.9.3.2 Hierarchical clustering
5.2.9.3.3 Practical Issues in Clustering
5.2.10 Support Vector Machines
5.2.10.1 Maximal Margin Classifier
5.2.10.1.1 What is a hyperplane?
5.2.10.1.2 Classification Using a Separating Hyperplane
5.2.10.1.3 Maximal Margin Classifier
5.2.10.1.4 Construction of the Maximal Margin Classifier
5.2.10.1.5 The Non-separable Case
5.2.10.2 Support Vector Classifiers
5.2.10.2.1 Overview of the Support Vector Classifier
5.2.10.2.2 Details of the Support Vector Classifier
5.2.10.3 Support Vector Machines
5.2.10.3.1 Classification with Non-Linear Decision Boundaries
5.2.10.3.2 The Support Vector Machine
5.2.10.3.3 An Application to the Heart Disease Data
5.2.10.4 SVMs with More than Two Classes
5.2.10.4.1 One-Versus-One Classification
5.2.10.4.2 One-Versus-All Classification
5.2.10.5 Relationship to Logistic Regression
5.2.11 Deep Learning
5.2.11.1 Single Layer Neural Network
5.2.11.2 Multilayer Neural Networks
5.2.11.3 Convolutional Neural Networks
5.2.11.3.1 Convolution Layers
5.2.11.3.2 Pooling Layers
5.2.11.3.3 Architecture of a Convolutional Neural Network
5.2.11.3.4 Data Augmentation
5.2.11.3.5 Results Using a Pretrained Classifier
5.2.11.4 Document Classification
5.2.11.4.1 Recurrent Neural Networks
5.2.11.5 When to Use Deep Learning
5.2.11.6 Fitting a Neural Network
5.2.11.6.1 Backpropagation
5.2.11.6.2 Regularization and Stochastic Gradient Descent
5.2.11.6.3 Dropout Learning
5.2.11.6.4 Network Tuning
5.2.11.7 Interpolation and Double Descent
5.3 Bayesian Statistics
5.3.1 Properties of Conditional Probability
5.3.2 Application of Bayes' Theorem: Examples
5.3.3 Independence
5.3.4 Random Variables; Joint Distributions; Law of Large Numbers (LLN); Central Limit Theorem (CLT)
5.3.5 Common Probability Distributions; Introduction to Bayesian Inference; One-parameter Models
5.3.6 Exponential Family; Frequentist Confidence Interval; Bayesian (Credible) Interval
5.3.7 Sufficiency; Rao-Blackwell Theorem
5.3.8 Monte Carlo Approximation
5.3.8.1 Monte Carlo Expectation
5.3.8.2 Additional information from Monte Carlo approximation beyond estimating parameters
5.3.8.3 Understanding discrepancies
5.3.9 Normal Model
5.3.9.1 Conjugate analysis
5.3.9.2 Precision and combining information
5.3.9.3 Prediction for a new observation
5.3.9.4 Joint inference for the mean and variance
5.3.9.5 Posterior inference
5.3.9.6 Monte Carlo sampling
5.3.9.7 Improper Priors
5.3.9.8 Bias, Variance, and Mean Squared Error (MSE)
5.3.9.9 Prior specification based on expectations
5.3.9.10 The normal model on non-normal data
5.3.10 Posterior approximation with the Gibbs sampler
5.3.10.1 A semiconjugate prior distribution
5.3.10.2 Discrete approximations
5.3.10.3 Sampling from the conditional distributions
5.3.10.4 Gibbs sampling
5.3.10.5 General properties of the Gibbs sampler
5.3.10.6 Introduction to MCMC diagnostics
5.3.11 Multivariate Normal Model
5.3.12 Group comparisons and hierarchical modeling
5.3.13 Linear Regression
5.3.14 Nonconjugate priors and the Metropolis-Hastings algorithm
5.3.14.1 Irreducibility, Aperiodicity, and Recurrence
5.3.14.2 Ergodic Theorem
5.4 CS50
5.4.1 Introduction
5.4.1.1 Unicode
5.4.1.2 Color
5.4.1.3 Algorithms
5.4.1.4 Artificial Intelligence
5.4.2 C
5.4.2.1 Running CS50 locally
5.4.2.2 Source code
5.4.2.3 From Scratch to C
5.4.2.4 Header Files
5.4.2.5 Hello, You
5.4.2.6 Terminal Commands
5.4.2.7 Types
5.4.2.8 Conditionals
5.4.2.9 Variables and compare.c
5.4.2.10 agree.c
5.4.2.11 Loops and cat.c
5.4.2.12 Functions
5.4.2.13 Correctness, Design, Style
5.4.2.14 Mario
5.4.2.15 calculator.c
5.4.2.16 Integer Overflow
5.4.2.17 Boeing
5.4.2.18 Pacman
5.4.2.19 Truncation
5.4.2.20 Type Casting
5.4.2.21 Floating-Point Imprecision
5.4.3 Arrays
5.4.4 Memory
5.4.5 Data Structures
5.4.6 Python
5.4.7 Artificial Intelligence
5.4.8 SQL
5.4.9 HTML, CSS, JavaScript
5.4.10 Flask
5.4.11 The End
5.4.12 ACE Recommendation
5.5 Deep Learning
5.5.1 9/5 Introduction to Neural Networks and Deep Learning
5.5.2 9/12 Gradient Descent and Backpropagation
5.5.3 9/19 Keras
5.5.4 9/26 Convolutional Neural Networks (CNNs)
5.5.5 10/3 Visualizing Feature Maps of CNN Layers, Locating Objects in Images
5.5.6 10/10 Transfer Learning, Fine Tuning, Augmentation
5.5.7 10/17 Autoencoders, Variational Autoencoders and Manifold Hypothesis
5.5.8 10/24 Natural Language Processing (NLP), Doc2Vec-like APIs, and …
5.5.9 10/31 Analysis and Transcription of Speech
5.5.10 11/7 Sequence Analysis, Seq2Seq Models and Machine Translation
5.5.11 11/14 Transformers
5.5.12 11/21 Large Language Models (LLMs)
5.5.13 12/5 Generative Adversarial Networks (GANs)
5.5.14 12/12 Graph Neural Networks (GNNs)
5.5.15 12/19 Final Project Presentations
5.6 Data Mining, Discovery, and Exploration
5.6.1 9/3 Course introduction and introduction to data mining
5.6.1.1 What is data mining?
5.6.1.2 Goals of data mining
5.6.1.3 Limitations
5.6.1.4 Bonferroni’s Principle
5.6.2 Important words
5.6.2.1 What is a hash function, a hash table?
5.6.2.2 Hashing for managing massive data sets
5.6.2.3 Scalable hypothesis test algorithms for data mining
5.6.3 9/10 Data mining massive and streaming data, Part 1
5.6.3.1 The nature of streaming data
5.6.3.2 Architectures for streaming data
5.6.3.3 Filters and counting for streams
5.6.3.4 Updating statistics with streams
5.6.3.5 Quantile estimation for streams
5.6.4 9/17 Data mining massive and streaming data, Part 2
5.6.5 9/24 Mining social-network graphs
5.6.5.1 Nature of social-network graphs
5.6.5.2 Centrality and influence
5.6.5.3 Clustering graphs
5.6.5.4 Spectral decomposition of graphs
5.6.5.5 Overlapping communities – time permitting
5.6.6 10/1 Similarity search at massive scale, Part 1
5.6.6.1 Distance measures
5.6.6.2 Similarity measures
5.6.6.3 Search with KD-trees
5.6.6.4 Approximate similarity search at massive scale
5.6.6.5 Indexes for massive similarity search
5.6.6.6 Evaluation of approximate similarity search algorithms
5.6.7 10/8 Similarity search at massive scale, Part 2
5.6.8 10/15 Information retrieval for document and web search
5.6.8.1 Dense vs. sparse embedding for document search
5.6.8.2 Dense neural embedding algorithms
5.6.8.3 Sparse embedding algorithms
5.6.8.4 Term expansion algorithms
5.6.8.5 Embedding images and CLIP
5.6.9 10/22 Learning to rank for search and recommendation, Part 1
5.6.9.1 Embedding high-cardinality features
5.6.9.2 Content-based methods
5.6.9.3 Collaborative filtering
5.6.9.4 Latent variable algorithms
5.6.9.5 The PageRank algorithm
5.6.9.6 Priority queues and heaps
5.6.10 10/29 Learning to rank for search and recommendation, Part 2
5.6.11 11/5 Clustering Algorithms, Part 1
5.6.11.1 Finding structure in data
5.6.11.2 The k-means algorithm
5.6.11.3 Problems with evaluating clustering results
5.6.11.4 Probabilistic clustering, dealing with uncertainty
5.6.11.5 Agglomerative hierarchical clustering and k-medoids algorithms for non-Euclidean spaces
5.6.11.6 Density-based clustering algorithms
5.6.11.7 Graph-based clustering algorithms
5.6.11.8 Database-scale clustering algorithms
5.6.12 11/12 Clustering Algorithms, Part 2
5.6.13 11/19 Dimensionality reduction
5.6.13.1 Eigenvalues and PCA
5.6.13.2 Singular value decomposition and its interpretation
5.6.13.3 Manifold learning and nonlinear spaces
5.6.13.4 Spectral embedding
5.6.13.5 The UMAP algorithm
5.6.14 11/23
5.6.15 12/3 Frequent item sets; market basket analysis
5.6.15.1 Market basket models
5.6.15.2 The A-Priori algorithm
5.6.15.3 Limited pass algorithms
5.6.15.4 Streaming algorithms – time permitting
5.6.16 12/10 Additional Topics
5.6.17 12/17 Final Project Due
Perpetually trying to remember what I forgot
Chapter 3: Protein Engineering
Coming soon