Welcome to the Math and Stats Independent Comps Presentations for Winter 2021! This term, students have recorded their comps presentations, which are available as prerecorded videos. Please refrain from posting these videos to any other online resource. Independent comps students will host a Q&A session by Zoom on March 4 and 5 at the times listed below. Questions submitted by 2 pm on the day of the presentation will receive full consideration. The Zoom link will be emailed prior to the event. We hope you enjoy the comps presentations.

## Thursday, March 4, 2021

**Time: 4:00-4:20 pm**

**Title: Matrix Completion Problem: nuclear norm minimization approach**

Speaker: Varit Bhanijkasem

Comps Presentation Video

Abstract: Starting with an incomplete matrix, is it possible to miraculously figure out exactly what the missing entries are? It turns out that such a task is achievable using nuclear norm minimization. In particular, the discussed algorithm recovers the matrix uniquely with high probability. In fact, when dealing with low-rank matrices, we do not need a substantial number of sampled entries. In my presentation, I will discuss how nuclear norm minimization can solve the matrix completion problem, the conditions that matrices need to satisfy, and a numerical experiment that tests its performance.

**Time: 4:20-4:40 pm**

**Title: Control Theory and Applications**

Speaker: Fred Cunningham

Comps Presentation Video

Abstract: Differential equations and classical mechanics tell you how to model dynamic systems and make predictions about their future behavior. However, no class at Carleton takes the step beyond prediction into control. The field of control theory focuses on extending our representation of dynamics to understand if and how these dynamic systems may be controlled. In this talk, the basic ideas of control theory are introduced to give the background needed to solve a classic example in control theory using Mathematica.

**Time: 4:40-5:00 pm**

**Title: Understanding Asset Pricing with Convex Geometry**

Speaker: Yilun Liu

Comps Presentation Video

Abstract: Separation theorems for convex sets are a basic mathematical tool that finds enormously widespread use throughout economics. Here, I will explore the close connection between the economics and the mathematics of convexity, working directly in the language and ideas of economics. Specifically, I will investigate how the separating hyperplane theorem finds its support in economic convexity, and how this mathematical theorem supports the fundamental theorem of asset pricing. The notion of arbitrage and its prerequisites lie within the marvelous world of convex geometry.

**Time: 5:00-5:20 pm**

**Title: Hilbert Spaces and the Riesz Representation Theorem**

Speaker: Ruofei Li

Comps Presentation Video

Abstract: R^n is a vector space. But have you considered that the set of all continuous complex-valued functions is also a vector space? What would a vector in such a space look like? How will the space behave? In this talk, we will generalize the tools we used in our preliminary studies of spaces to broader situations. We will learn about characteristics of spaces and get to know what a Hilbert space is. Finally, we will sketch a proof of the Riesz Representation Theorem, which is a powerful tool for studying functionals.

**Time: 5:20-5:40 pm**

**Title: The Lineup Protocol: Using Simulation to Improve Visual Diagnostics**

Speaker: Jack Moran

Comps Presentation Video

Abstract: What does a “good” diagnostic plot look like? When we look at a residual or QQ-plot, we are implicitly comparing it to what we view as a “good” plot in our mind. It turns out that we as humans are exceptional at picking patterns out of randomness and hence tend to over-interpret visual diagnostics. We attempt to remedy this problem by simulating “good” diagnostic plots for a viewer to use as references when they evaluate a visual display. In this talk, we look at a variety of simulation-based diagnostics in the context of binary logistic regression (regression where the response variable is a success/failure). We explore how to simulate data according to a fitted model and discuss how simulation can help us detect model violations. In particular, we look at the results of a simulation study for a specific type of simulation-based diagnostic, the lineup protocol.

**Time: 5:40-6:00 pm**

**Title: The 3x+1 Problem**

Speaker: Yuting Su

Comps Presentation Video

Abstract: Pick a positive integer, x. If x is odd, we make x = 3x+1; if x is even, we make x = x/2. If we repeat this step 10 times, will the new x be larger or smaller than the original x? What if we repeat the step 100 times, 1000 times, or even more? Although the calculation of each new x is fairly easy (any 4th grader can do it), it is very difficult to predict the trajectory of x. So what approaches can we take to tackle this question? In this presentation, we will explore the 3x+1 problem, also known as the Collatz Conjecture, through the lenses of algebra, graphs, probability, and parity vectors in order to produce some interesting results about this topic.
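The step described in the abstract is easy to try by hand or in a few lines of code. The sketch below is only an illustration, not material from the talk: the function names `collatz_step`, `trajectory`, and `parity_vector` are hypothetical, and the parity vector is assumed here to mean the list of parities (1 for odd, 0 for even) of the values visited before each step.

```python
def collatz_step(x: int) -> int:
    """Apply one step of the 3x+1 rule to a positive integer."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    # Odd values go to 3x+1, even values are halved.
    return 3 * x + 1 if x % 2 == 1 else x // 2


def trajectory(x: int, steps: int) -> list[int]:
    """Return the starting value followed by the results of `steps` steps."""
    values = [x]
    for _ in range(steps):
        x = collatz_step(x)
        values.append(x)
    return values


def parity_vector(x: int, steps: int) -> list[int]:
    """Parities of the values that each of the `steps` steps is applied to.

    This definition is an assumption made for illustration.
    """
    return [value % 2 for value in trajectory(x, steps)[:-1]]


if __name__ == "__main__":
    print(trajectory(7, 10))
    print(parity_vector(7, 10))
```

Starting from 7, ten steps visit 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, so the value after 10 steps is larger than the starting value even though half of the steps were halvings.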

**Time: 6:00-6:20 pm**

**Title: Stommel’s Box Model for Thermohaline Circulation**

Speaker: Frank Liu

Comps Presentation Video

Abstract: What do we mean by “climate”? How do we define the role of mathematics in discussing climate topics? What perspectives and angles do we use to model the climate system? Stommel’s Box Model applies a system-level approach to simulating the thermohaline circulation. Despite a moderate level of reduction and simplification, I believe this model still retains the essential attributes of the physical world and successfully reproduces complex phenomena. In this presentation, I will discuss the origin, significance, and angle of Stommel’s Box Model for Thermohaline Circulation.

## Friday, March 5, 2021

**Time: 4:00-4:20 pm**

**Title: Methods For Evaluating Pitcher Performance**

Speaker: Blake Anderson

Comps Presentation Video

Abstract: Many statistics fail to isolate a pitcher’s performance. Is pitching performance in baseball easily quantifiable? Why are traditional statistics flawed? What are the best statistics to analyze pitchers? In my talk, I will answer these questions and many more, such as why K/9 and HR/9 have been increasing since 1945 and why the best pitchers in MLB don’t always win the Cy Young Award. In particular, I will explain and give examples of the different types of pitching statistics and break down how the best advanced statistics are calculated. I will go through some of my analysis in R using familiar statistical packages on relevant batted ball data pulled from Baseball Savant. I will discuss my analysis of line-drive pitches using ggplots, GAMs, and clustering, but most of my time was spent analyzing the performance of my favorite pitcher. I was especially interested in discovering why he was excellent during the 2018 season and league average during the 2019 season. I used the most relevant batted ball data and advanced statistics to form my data-driven conclusions.

**Time: 4:40-5:00 pm**

**Title: Between You & Me: Information and Coding Theory**

Speaker: Kate Sweeney

Comps Presentation Video

Abstract: Information theory and coding theory are the backbone of how we think about information in modern computing. Taking off in earnest in the 1940s with the dawn of the Information Age, these fields underpin everything from the bit to compression to error detection. A bit is a 1 or 0 in a binary string, but what does it really mean? Compression cuts down your file size, but how (and just how small can it get)? Our data travels from point A to point B with the potential for corruption or loss at every stage. When you can’t count on the binary string you received to be the same as the one I sent, how can we transmit information between you and me? Information theory and coding theory seek to answer all those questions and more.

**Time: 5:20-5:40 pm**

**Title: Modelling Extreme Values in Property Loss Insurance Records**

Speaker: Yiwen Luo

Comps Presentation Video

Abstract: How do we model a fat-tailed distribution to find out how large or how small the data will probably get? Probability distributions like the normal distribution will likely underestimate the tail size of distributions with heavy tails. In this presentation, I will first introduce the Generalized Extreme Value (GEV) distribution and the Generalized Pareto Distribution (GPD), along with their parameter estimation process. I will then introduce specialized approaches for estimating extreme values, such as the Block Maxima approach and the Peaks Over Threshold approach, which use the GEV and GPD to model the extreme values in the data. We will apply these methods to an example of fire loss insurance records and use Value at Risk to evaluate the possible maximum loss in the future.

**Time: 5:40-6:00 pm**

**Title: Brownian Motion and its Applications**

Speaker: ZhaoBin Li

Comps Presentation Video

Abstract: Named after Robert Brown and explained by Albert Einstein, Brownian motion describes the random movement of particles. The talk will define Brownian motion as a stochastic process through visual illustrations and simulations. Then, using the metaphor of a drunk tap dancer, we will show how a random walk converges to Brownian motion in the limit. Finally, we will briefly explain why economists like to use geometric Brownian motion to generate stock prices and why Brownian motion breaks down drastically at modelling GameStop.
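The limiting step mentioned in the abstract can be previewed with a short simulation. The sketch below is not taken from the talk; it assumes the standard construction in which a walker takes n steps of size ±1/√n per unit of time, and the names `scaled_random_walk` and `endpoint_variance` are hypothetical. As n grows, the position at time 1 should have variance close to 1, matching standard Brownian motion.

```python
import math
import random


def scaled_random_walk(n: int, rng: random.Random) -> list[float]:
    """Simulate a random walk on [0, 1] with n steps of size 1/sqrt(n).

    Each step is +1/sqrt(n) or -1/sqrt(n) with equal probability.
    """
    step = 1.0 / math.sqrt(n)
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + step * rng.choice((-1, 1)))
    return path


def endpoint_variance(n: int, trials: int, seed: int = 0) -> float:
    """Estimate the variance of the walk's position at time 1."""
    rng = random.Random(seed)
    endpoints = [scaled_random_walk(n, rng)[-1] for _ in range(trials)]
    mean = sum(endpoints) / trials
    return sum((e - mean) ** 2 for e in endpoints) / (trials - 1)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(endpoint_variance(n, trials=2000), 3))
```

Plotting several paths from `scaled_random_walk` for increasing n gives a picture of the walk looking more and more like a continuous, jagged Brownian path.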