Buomsoo Kim

Deep learning state of the art 2020 (MIT Deep Learning Series) - Part 3


This is the third and last part of the review of Lex Fridman's Deep learning state of the art 2020 talk. In this posting, let's review the remaining part of his talk, starting with government, politics, and policy.

Government, Politics, and Policy

AI in Political Discourse - Andrew Yang

The first presidential candidate to discuss AI extensively as part of his platform

  • Department: create a new executive department (the Department of Technology) with a focus on AI
  • Companies: create a public-private partnership with leading tech companies

American AI Initiative

In February 2019, the president signed Executive Order 13859, announcing the American AI Initiative:

  • Investment in long-term research
  • Support research in academia and industry
  • Access to federal data
  • Promote STEM education
  • Develop AI in “a manner consistent with our Nation’s values, policies, and priorities”
  • AI must also be developed in a way that does not compromise our American values, civil liberties, or freedoms.

Ethics of recommender systems

Most of the recommender systems used by large tech companies (e.g., FAANG) rely on DL. There should be more effort to inform the public and government about how these systems work.

Play Store App discovery (DeepMind + Google)

Hopes for 2020

Less fear of AI

More balanced, informed discussion on the impact of AI in society

Experts

Continued conversations by government officials about AI, privacy, and cybersecurity with experts from academia and industry

Recommender system transparency

More open discussion of and publications about the recommender systems used in industry

Courses, Tutorials & Books

If you are interested, please refer to my curation on data science study materials for a more comprehensive list of courses, tutorials, and books!

Online DL courses

Deep Learning

  • Fast.ai: Practical deep learning for coders
  • Stanford CS231n: CNN for visual recognition
  • Stanford CS224n: NLP with DL
  • Deeplearning.ai (Coursera): Deep Learning (Andrew Ng)

Reinforcement Learning

  • David Silver: Intro to RL
  • OpenAI: Spinning Up in Deep RL

Tutorials

Over 200 of the best ML, NLP, and Python tutorials (by Robbie Allen)

Deep learning books

  • Deep Learning with Python (by F. Chollet)
  • grokking Deep Learning (by A. W. Trask)
  • Deep Learning (by I. Goodfellow)

General hopes for 2020

Summary & key points

  • Reasoning
  • Active learning and life-long learning
  • Multi-modal and multi-task learning
  • Open-domain conversations
  • Applications: medical, autonomous vehicles
  • Algorithmic ethics
  • Robotics
  • Recommender systems

Recipe for progress in AI

Again, it is a great talk and I really enjoyed watching and summarizing it. If you are interested, please check out the YouTube link to the video! Even if you don't watch the whole video and just focus on the parts you are interested in, I bet you can benefit from it greatly.

Deep learning state of the art 2020 (MIT Deep Learning Series) - Part 2


In the previous posting, we reviewed Part 1 of the Deep learning state of the art 2020 talk by Lex Fridman. In this posting, let's review the next part of his talk, starting with deep reinforcement learning.

Deep reinforcement learning and self-play

OpenAI & Dota2

In April 2019, OpenAI Five beat OG, the 2018 world champion team.

  • Trained 8 times longer than the 2018 version
  • Experienced about 45,000 years of self-play over 10 real-time months
  • The 2019 version has a 99.9% win rate vs. the 2018 version

DeepMind & Quake 3 Arena Capture the Flag

Use self-play to solve the multi-agent game problem

“Billions of people inhabit the planet, each with their own individual goals and actions, but still capable of coming together through teams, organisations and societies in impressive displays of collective intelligence. This is a setting we call multi-agent learning: many individual agents must act independently, yet learn to interact and cooperate with other agents. This is an immensely difficult problem - because with co-adapting agents the world is constantly changing.”

DeepMind AlphaStar

  • Dec 2018, AlphaStar beats MaNa, one of the world’s strongest professional players, 5-0
  • Oct 2019, AlphaStar reaches Grandmaster level playing the game under professionally approved conditions (for humans)

“AlphaStar is an intriguing and unorthodox player - one with the reflexes and speed of the best pros but strategies and a style that are entirely its own. The way AlphaStar was trained, with agents competing against each other in a league, has resulted in gameplay that’s unimaginably unusual; it really makes you question how much of StarCraft’s diverse possibilities pro players have really explored.” - Kelazhur, professional StarCraft 2 player

Pluribus - Texas Hold’em Poker

Pluribus won in six-player no-limit Texas Hold'em poker, a setting that combines:

  • Imperfect information
  • Multi-agent

Offline

Self-play to generate coarse-grained “blueprint” strategy

Online

Use search to improve the blueprint strategy based on the particular situation (a sketch of the self-play primitive follows below)

“Its major strength is its ability to use mixed strategies. That’s the same thing that humans try to do. It’s a matter of execution for humans - to do this in a perfectly random way and to do so consistently. Most people just can’t” - Darren Elias, professional Poker player
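To make the self-play "blueprint" idea concrete, below is a minimal sketch of regret matching, the primitive underlying the counterfactual regret minimization (CFR) family of algorithms that Pluribus's blueprint computation builds on. The rock-paper-scissors domain and all names here are illustrative, not Pluribus's actual code:

```python
import numpy as np

# Payoff matrix for rock-paper-scissors (row player's payoff).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]])

def strategy_from(regrets):
    """Regret matching: play actions in proportion to positive regret."""
    pos = np.maximum(regrets, 0)
    return pos / pos.sum() if pos.sum() > 0 else np.ones(3) / 3

regrets = [np.zeros(3), np.zeros(3)]
strategy_sum = [np.zeros(3), np.zeros(3)]
rng = np.random.default_rng(0)

for _ in range(20000):
    strats = [strategy_from(r) for r in regrets]
    acts = [rng.choice(3, p=s) for s in strats]
    for i in range(2):
        strategy_sum[i] += strats[i]
    # Regret of each alternative action vs. the payoff actually obtained.
    regrets[0] += A[:, acts[1]] - A[acts[0], acts[1]]
    regrets[1] += A[:, acts[0]] - A[acts[1], acts[0]]  # symmetric game

# Average strategies converge to the Nash equilibrium (~uniform).
print(strategy_sum[0] / strategy_sum[0].sum())
```

In self-play, both players' average strategies converge to the Nash equilibrium (uniform random for rock-paper-scissors); Pluribus runs a vastly scaled-up version of this idea over the poker game tree.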

OpenAI Rubik’s Cube Manipulation

  • Automatic Domain Randomization (ADR): generate progressively more difficult environments as the system learns (an alternative to self-play); a toy sketch follows below
  • “Emergent meta-learning”: the capacity of the environment is unlimited, while the network is constrained
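The gist of ADR can be sketched in a few lines. This is a toy version under simplifying assumptions: the actual OpenAI implementation tracks performance separately at the boundaries of each parameter's range, whereas this sketch widens every range whenever recent success is high enough. All names and numbers are illustrative:

```python
import random

class ToyADR:
    """Toy sketch of Automatic Domain Randomization (illustrative only)."""

    def __init__(self):
        # Each environment parameter starts as a single point and grows
        # into a range as the policy improves.
        self.ranges = {"cube_size_scale": [1.0, 1.0], "friction": [0.5, 0.5]}
        self.step = 0.05       # how much to widen per expansion
        self.recent = []       # recent episode outcomes

    def sample_env(self):
        """Sample an environment from the current randomization ranges."""
        return {k: random.uniform(lo, hi) for k, (lo, hi) in self.ranges.items()}

    def report(self, success: bool, threshold: float = 0.8, window: int = 50):
        """Widen all ranges once the agent succeeds often enough."""
        self.recent.append(success)
        if len(self.recent) >= window:
            if sum(self.recent) / window >= threshold:
                for bounds in self.ranges.values():
                    bounds[0] -= self.step   # harder environments
                    bounds[1] += self.step
            self.recent.clear()
```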

Hopes for 2020

Robotics

Use of RL methods in manipulation and real-world interaction tasks

Human behavior

Use of multi-agent self-play to explore naturally emerging social behaviors as a way to study equivalent multi-human systems

Games

Use RL to assist human experts in discovering new strategies at games and other tasks in simulation

Science of Deep Learning

The Lottery Ticket Hypothesis

For every network, there is a subnetwork that can achieve the same level of accuracy after training; in other words, there exist architectures that are much more efficient!

“Based on these results, we articulate the lottery ticket hypothesis: dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that — when trained in isolation — reach test accuracy comparable to the original network in a similar number of iterations.” - Frankle and Carbin (2019)
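In code, the paper's procedure is iterative magnitude pruning: train, prune the smallest-magnitude weights, rewind the survivors to their original initialization, and repeat. A minimal PyTorch sketch, where `train_fn` and the hyperparameters are placeholders (a full implementation must also keep masked weights at zero during training):

```python
import copy
import torch

def find_winning_ticket(model, train_fn, prune_frac=0.2, rounds=5):
    """Iterative magnitude pruning, sketched. train_fn(model) is a
    placeholder hook that trains the model in place."""
    init_state = copy.deepcopy(model.state_dict())            # theta_0
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()}
    for _ in range(rounds):
        train_fn(model)
        with torch.no_grad():
            for name, p in model.named_parameters():
                alive = p[masks[name].bool()].abs()
                cutoff = torch.quantile(alive, prune_frac)    # lowest 20% go
                masks[name] *= (p.abs() > cutoff).float()
            # Rewind surviving weights to their original initialization.
            model.load_state_dict(init_state)
            for name, p in model.named_parameters():
                p.mul_(masks[name])
    return model, masks
```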

Disentangled Representations

Unsupervised learning of disentangled representations without inductive biases is impossible, so inductive biases (assumptions) should be made explicit.

“Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision, investigate concrete benefits of enforcing disentanglement of the learned representations, and consider a reproducible experimental setup covering several data sets.” - Locatello et al. (2019)

Deep Double Descent

“We show that a variety of modern deep learning tasks exhibit a “double-descent” phenomenon where, as we increase model size, performance first gets worse and then gets better. Moreover, we show that double descent occurs not just as a function of model size, but also as a function of the number of training epochs. We unify the above phenomena by defining a new complexity measure we call the effective model complexity.” - Nakkiran et al. (2019)
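The phenomenon is easy to reproduce in a toy setting: fit minimum-norm least squares on random ReLU features and sweep the feature count p past the interpolation threshold (p = n). A sketch with arbitrary sizes; test error should rise near p = n and fall again beyond it:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 20
w_true = rng.normal(size=d)
X, X_test = rng.normal(size=(n, d)), rng.normal(size=(1000, d))
y = X @ w_true + 0.5 * rng.normal(size=n)
y_test = X_test @ w_true

V = rng.normal(size=(d, 400))               # fixed random projection directions

for p in [10, 50, 90, 100, 110, 200, 400]:  # number of random ReLU features
    features = lambda A: np.maximum(A @ V[:, :p], 0)
    coef = np.linalg.pinv(features(X)) @ y  # minimum-norm least squares
    mse = np.mean((features(X_test) @ coef - y_test) ** 2)
    print(f"p = {p:3d}  test MSE = {mse:.3f}")
```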

Hopes for 2020

Fundamentals

Exploring fundamentals of model selection, training dynamics, and representation characteristics with respect to architecture characteristics.

Graph neural networks

Exploring use of graph neural networks for combinatorial optimization, recommender systems, etc.
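The core operation of a graph neural network is message passing: each node aggregates its neighbors' features and updates its own representation. A minimal NumPy sketch with mean aggregation (everything here is illustrative):

```python
import numpy as np

def gnn_layer(adj, h, w):
    """One mean-aggregation message-passing layer.

    adj: (n, n) 0/1 adjacency matrix, h: (n, d) node features,
    w: (d, d_out) weights. Returns updated (n, d_out) node features.
    """
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)  # node degrees
    msgs = (adj @ h) / deg                            # mean of neighbor features
    return np.maximum((h + msgs) @ w, 0)              # combine, project, ReLU

# Toy graph: a triangle with 4-dimensional node features.
adj = np.array([[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]])
h = np.random.randn(3, 4)
w = np.random.randn(4, 4)
print(gnn_layer(adj, h, w).shape)  # (3, 4)
```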

Bayesian deep learning

Exploring Bayesian neural networks for estimating uncertainty and online/incremental learning
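One cheap, widely used technique in this space is Monte Carlo dropout (Gal & Ghahramani): keep dropout active at inference and treat the spread of repeated predictions as a rough uncertainty estimate. A minimal PyTorch sketch:

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(8, 32), torch.nn.ReLU(),
    torch.nn.Dropout(p=0.2), torch.nn.Linear(32, 1),
)

def mc_dropout_predict(model, x, n_samples=50):
    """Run the model with dropout left on; the mean is the prediction and
    the standard deviation is a rough epistemic-uncertainty estimate."""
    model.train()  # train mode keeps dropout stochastic (note: this would
                   # also affect layers like batch norm, if present)
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.std(dim=0)

mean, std = mc_dropout_predict(model, torch.randn(4, 8))
```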

Autonomous Vehicles and AI-assisted driving

Waymo

Level-4 autonomous vehicles - machine is responsible

  • On-road: 20M miles
  • Simulation: 10B miles
  • Testing & validation: 20,000 classes of structured tests
  • Initiated testing without a safety driver

Tesla Autopilot

Level-2 autonomous vehicles - human is responsible

  • Over 900k vehicle deliveries
  • Currently about 2.2B estimated Autopilot miles
  • Projected Autopilot miles of 4.1B by 2021

Active learning & multi-task learning

  • Collaborative Deep Learning (aka Software 2.0 Engineering)

Role of human experts: train the neural network and identify edge cases that can maximize improvement (a toy sketch of this loop follows below)
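Below is a minimal sketch of the underlying active-learning loop, using uncertainty sampling on synthetic data; nothing here is Tesla's actual pipeline. The model is trained on a small labeled set, and the examples it is least confident about are routed to a human expert for labeling:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # synthetic "ground truth"

labeled = list(range(20))        # small seed set with labels
pool = list(range(20, 1000))     # unlabeled pool (labels hidden)

for round_ in range(5):
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    probs = clf.predict_proba(X[pool])[:, 1]
    # Uncertainty sampling: query the points closest to the decision boundary.
    query = np.argsort(np.abs(probs - 0.5))[:10]
    for i in sorted(query.tolist(), reverse=True):
        labeled.append(pool.pop(i))  # "human expert" labels the edge case
    print(f"round {round_}: accuracy {clf.score(X, y):.3f}")
```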

Vision vs. Lidar (Level 2 vs. Level 4)

Level 2 - Vision sensors + DL

  1. Pros
    • Highest resolution information
    • Feasible to collect data at scale and learn
    • Roads are designed for human eyes
    • Cheap
  2. Cons
    • Needs a huge amount of data to be accurate
    • Less explainable
    • Driver must remain vigilant

Level 4 - Lidar + Maps

  1. Pros
    • Explainable, consistent
    • Accurate with less data
  2. Cons
    • Less amenable to ML
    • Expensive (for now)
    • Safety driver or teleoperation fallback

Hopes for 2020

Applied deep learning innovation

Life-long learning, active learning, multi-task learning

Over-the-air updates

More Level 2 systems beginning both data collection and over-the-air software updates

Public datasets of edge-cases

More publicly available datasets of challenging cases

Simulators

Improvement of publicly available simulators (CARLA, NVIDIA DRIVE Constellation, Voyage Deepdrive)

Less hype

More balanced, in-depth reporting (by journalists and companies) on successes and challenges of autonomous vehicle development.

So far, we have looked into the deep reinforcement learning, science of deep learning, and autonomous vehicle parts of the talk. In the next posting, let's examine the remaining part of the talk.

Deep learning state of the art 2020 (MIT Deep Learning Series) - Part 1


This is one of the talks in the MIT Deep Learning Series by Lex Fridman on state-of-the-art developments in deep learning. In this talk, Fridman covers achievements in various application fields of deep learning (DL), from NLP to recommender systems. It is a very informative talk encompassing diverse facets of DL, not just technicalities but also issues regarding people, education, business, policy, and ethics. I encourage anyone interested in DL to watch the video if time allows. For those who do not have enough time or want to review the contents, I have summarized them in this posting and provided hyperlinks to additional materials. Since it is a fairly long talk with a great amount of information, this posting covers the first part of the talk, up to the natural language processing (NLP) section.

About the speaker

Lex Fridman is an AI researcher at MIT whose primary interests are human-computer interaction, autonomous vehicles, and robotics. He also hosts a podcast with leading researchers and practitioners in information technology, such as Elon Musk and Andrew Ng.

Below is a summary of his talk.

AI in the context of human history

The dream of AI

“AI began with an ancient wish to forge the gods” - Pamela McCorduck, Machines Who Think (1979)

DL & AI in context of human history

Dreams, mathematical foundations, and engineering in reality

“It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers. They would be able to converse with each other to sharpen their wits. At some stage therefore, we should have to expect the machines to take control” - Alan Turing, 1951

  • Frank Rosenblatt, Perceptron (1957, 1962)
  • Kasparov vs. Deep Blue (1997)
  • Lee Sedol vs. AlphaGo (2016)
  • Robots and autonomous vehicles

History of DL ideas and milestones

  • 1943: Neural networks (Pitts and McCulloch)
  • 1957-62: Perceptrons (Rosenblatt)
  • 1970-86: Backpropagation, RBM, RNN (Linnainmaa)
  • 1979-98: CNN, MNIST, LSTM, Bidirectional RNN (Fukushima, Hopfield)
  • 2006: “Deep learning”, DBN
  • 2009: ImageNet
  • 2012: AlexNet
  • 2014: GANs
  • 2016-17: AlphaGo, AlphaZero
  • 2017-19: Transformers

Deep learning celebrations, growth, and limitations

Turing award for DL

  • Yann LeCun, Geoff Hinton, and Yoshua Bengio win the Turing Award (2018)

“The conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing”

Early figures in DL

  • 1943: Walter Pitts & Warren McCulloch (computational model for neural nets)
  • 1957, 1962: Frank Rosenblatt (perceptron with single- & multi-layer)
  • 1965: Alexey Ivakhnenko & V. G. Lapa (learning algorithm for MLP)
  • 1970: Seppo Linnainmaa (backpropagation and automatic differentiation)
  • 1979: Kunihiko Fukushima (convolutional neural nets)
  • 1982: John Hopfield (Hopfield networks, i.e., recurrent neural nets)

People of DL & AI

Lex’s hope for the community

  • More respect, open-mindedness, collaboration, credit sharing
  • Less derision, jealousy, stubbornness, academic silos

Limitations of DL

DL criticism

  • In 2019, it became cool to say that DL has limitations
  • “By 2020, the popular press starts having stories that the era of Deep Learning is over” (Rodney Brooks)

Growth in DL community

Hopes for 2020

Less hype & less anti-hype

Hybrid research

Research topics

  • Reasoning
  • Active learning & life-long learning
  • Multi-modal & multi-task learning
  • Open-domain conversation
  • Applications: medical, autonomous vehicles
  • Algorithmic ethics
  • Robotics

DL and deep reinforcement learning frameworks

DL frameworks

TensorFlow (2.0)

  • Eager execution by default (see the snippet below)
  • Keras integration
  • TensorFlow.js, TensorFlow Lite, TensorFlow Serving, …
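The first two bullets are easy to see in a few lines; a minimal sketch using standard TF 2.x APIs:

```python
import tensorflow as tf

# Eager execution is the default in TF 2.x: operations run immediately.
x = tf.constant([[1.0, 2.0]])
print(tf.square(x))   # no session or graph needed

# Keras is integrated as the high-level model-building API.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(2,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```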

PyTorch (1.3)

  • TorchScript (graph representation; see the snippet below)
  • Quantization
  • PyTorch Mobile
  • TPU support
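A minimal sketch of TorchScript export via tracing (the file name is illustrative):

```python
import torch

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.fc(x))

# Trace the module into a TorchScript graph, deployable without Python
# (e.g., from C++ or PyTorch Mobile via torch.jit.load).
scripted = torch.jit.trace(Net(), torch.randn(1, 4))
scripted.save("net.pt")
```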

RL frameworks

  • Tensorflow: OpenAI Baselines (Stable Baselines), TensorForce, Dopamine (Google), TF-Agents, TRFL, RLLib (+Tune), Coach
  • Pytorch: Horizon, SLM-lab
  • Misc: RLgraph, Keras-RL

Hopes for 2020

Framework-agnostic research

Make it easier to translate from PyTorch to TensorFlow and vice versa

Mature deep RL frameworks

Converge to fewer, actively-developed, stable RL frameworks less tied to TF or PyTorch

Abstractions

Build higher-level abstractions (e.g., Keras, fastai) to empower people outside the ML community

Natural Language Processing

Transformer

BERT

State-of-the-art performance on various NLP tasks, e.g., sentence classification and question answering (a short example of how accessible these models have become follows below)
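Here is a hedged sketch using the Hugging Face `transformers` library; the `pipeline` API downloads a default pretrained (BERT-family) model, and the model choices are the library's, not from the talk:

```python
# pip install transformers
from transformers import pipeline

# Sentence classification with a default pretrained model.
classifier = pipeline("sentiment-analysis")
print(classifier("BERT makes sentence classification almost trivial."))

# Extractive question answering.
qa = pipeline("question-answering")
print(qa(question="What does BERT achieve?",
         context="BERT achieves state-of-the-art results on various NLP tasks."))
```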

Transformer-based language models (2019)

Alexa Prize and open domain conversations

Amazon open-sourced the Topical-Chat dataset, inviting researchers to participate in the Alexa Prize Challenge

Lessons learned

Developments in other NLP tasks

Seq2Seq

Multi-domain dialogue

Common-sense reasoning

Hopes for 2020

Reasoning

Combining (commonsense) reasoning with language models

Context

Extending language model context to thousands of words

Dialogue

More focus on open-domain dialogue

Video

Ideas and successes in self-supervised learning in visual data

So far, this summarizes the talk up to the NLP part. In the next few postings, I will distill and summarize the remaining parts of the talk.

The Craft of Writing Effectively (UChicago Leadership Lab)


Writing is one of the most important skills in the professional life of individuals, especially for people in academia. In his talk on speaking, Professor Patrick Winston mentioned that the ability to write properly is the second most important success factor in life (of course, the most important one he mentioned is speaking). In this inspiring talk, Larry McEnerney, Director of the University of Chicago's Writing Program, delineates the Craft of Writing Effectively in a provocative, yet engaging manner.

  • YouTube Link to the lecture video
  • PDF file of the handout used in the lecture (provided for the Ohio State University)

About the speaker

Lawrence McEnerney is the Director of Writing Programs at the University of Chicago. Besides teaching students how to write and speak, he has served as a Resident Master and has consulted for numerous clients, such as universities, institutes, and businesses, on effective communication.

Below is a summary of his talk on the craft of writing.

Think about readers when writing

This is the gist of the talk: do not think about rules when writing. Think about readers, especially if you are an expert who cares about the value of your writing.

Orthogonal processes of reading and writing

The horizontal process of writing interferes with the vertical process of reading

  • Horizontal process of writing: In most cases, writers use the writing process to help themselves think and make sense of the world.
  • Vertical process of reading: Readers use the text to change the way they think about the world.

As a result, the writing process interferes with the reading process of readers.

When readers are interrupted in this way,

  • They slow down or re-read.

  • Then, they misunderstand.

  • Then, they are aggravated.

  • Then, they stop reading.

Teachers do not stop reading because they are paid to care about you

They are paid to read and grade your text, not to change the way they see the world.

Nevertheless, in the world beyond school, people are not paid to care about you. They read because the material has VALUE to them.

Your writing has to be VALUABLE

Common desiderata of professional text

  • Clear
  • Organized
  • Persuasive
  • VALUABLE

Value is in the eye of the reader

  • Value does not lie in the “world.” It lies in the mind of readers.

Don’t try to make readers understand; change their ideas

  • When people do not recognize the importance of work, many try to explain. But DO NOT EXPLAIN.

Explaining is “revealing to the world what is inside your head. No one cares what is inside your head!”

  • Writing is not about communicating your ideas to readers. It is about CHANGING THEIR IDEAS.

When you have to explain, do so within the principles of being VALUABLE and PERSUASIVE.

How to make your writing valuable

Contrasting views on knowledge

Positivistic view of knowledge

  • The more, the better. The newer, the better

Alternate view of knowledge

  • People interact and reach a consensus on what knowledge is (and isn’t)
  • You have to deal with what the community says knowledge is (and isn't).

New and Original are not necessarily knowledge

  • What matters more is: who cares?

Learning the CODE

Identify people with power in the community, and give them what they want

  • Every community has its own “CODE” that is shared across its members.

  • Persuasion depends on what readers doubt. You have to know about the readers, i.e., the CODE.

  • To get a paper published, you have to criticize previous work in accordance with the CODE.

“You are great. You advanced our community in fabulous ways. But… (argument)”

  • Examples of vocabulary indicating community/code: widely, accepted, reported

  • Spend 15 minutes a week: take articles in your field, print them out, and circle every word that is creating value.

Nuts and bolts of VALUABLE writing

INSTABILITY

Words that indicate tension, challenge, contradiction, or a red flag:

  • anomaly
  • inconsistent
  • but
  • however
  • although

Writing the introduction part

The positivistic approach

  • Background/definition: stability, consistency, continuity, …
  • Thesis

The VALUE approach

  1. PROBLEM: for a specific set of readers
    • Instability: but, however, although, inconsistent, anomaly, …
    • Use graphics (e.g., charts) to emphasize a problem

Again, in doing so, follow the CODE!

  2. SOLUTION
    • Cost/benefit: instability imposes costs on readers OR, if solved, offers benefits to them

Literature review

Literature review for the teacher

  • The whole purpose is to show the teacher that the student understands the topic

“In 2001 he said this, in 2002 he said this, and in 2005 he said this, …”

Lit review in a professional text

  • The main purpose is to enrich the PROBLEM. Again, emphasize instability.

“In 2001 he said this, but in 2004, if we are smart, we realize … and in 2005 he said this, which complicates the situation. The situation is more complicated when considering previous discussions…”

  • Usually, more background means more problem, not more “lit review”

Precaution of emphasizing “GAP” in knowledge

  • A gap assumes another model of knowledge. It assumes knowledge is bounded, like a puzzle.

  • If knowledge is unbounded, filling a single gap is meaningless.

How to Speak by Patrick Winston (MIT OpenCourseWare)


This is a fantastic lecture by Professor Patrick Winston at MIT on how to speak. Communication skills are arguably among the most critical facets of life. In particular, speaking properly in front of others is extremely important for professional success - job interviews, conference presentations, project meetings, and so on. In this video, Professor Winston clearly outlines the fundamental building blocks of a good speech and how to put them together to communicate clearly with the audience.

About the speaker

Professor Patrick Winston was a computer scientist at the Massachusetts Institute of Technology (MIT). He was director of the MIT Artificial Intelligence Laboratory from 1972 to 1997. His doctoral advisor was Marvin Minsky, one of the fathers of modern AI.

Below is a summary of his talk on the art of speaking.

Success factors in life

Success in one's life is largely determined by the three factors below, ordered by importance.

  1. Ability to speak
  2. Ability to write
  3. Quality of ideas

Quality of your speech

The quality of your speech is contingent upon three major factors (condensed into a formula below).

  • $K$: knowledge
  • $P$: practice
  • $T$: inherent talent (very small)
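Winston condenses this into a qualitative formula. The rendering below is my own paraphrase of his slide, not an exact transcription:

$$\text{Quality} = f(K, P, T), \qquad K > P \gg T$$

In other words, knowledge of the subject matters most, deliberate practice comes next, and inherent talent contributes only a small amount.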

How to start

No Jokes

Starting with jokes is not a good idea because the audience is not ready for a joke.

Empowerment Promise

Instead, tell people what they will know at the end of the hour that they didn't know at the beginning.

Samples for the speech “armamentarium”

Cycle

Repeatedly emphasize your idea (three times) so that people remember it

Build fence

Build a fence around your idea so that the audience doesn’t confuse it with somebody else’s idea

Verbal punctuation

Provide landmarks that let the audience know it is a good time to "get back on" to the talk

Questions

Ask a question (and wait 7 seconds for an answer) that is neither too hard nor too easy.

Time & Place

Ideally, speak at 11 AM in a place that is well lit, "cased" (checked out beforehand), and reasonably populated.

Tools

  • “Empathetic mirroring”: movements in the physical world are mirrored in the brain.

Boards

Boards are useful when informing, teaching, and lecturing.

  • Graphics
  • Speed
  • Target: boards give your hands a place to go; putting your hands behind your back or in your pockets can be inappropriate.

Props

Props help the audience remember the details.

Slides

Slides are good when your purpose is exposing, e.g., job talks and conference presentations.

  • Common problem: too many slides, too many words

Use slides as condiments, not the main part of the talk. Do not use small fonts.

  • Important rules
  1. Do not read
  2. Be in the image
  3. Keep images simple
  4. Eliminate clutter

  • Common crimes
  1. Small font size

  2. Laser pointers: not recommended

The speaker gets no eye contact, no engagement with the audience.

Rather, put a little arrow on the slide to point out something.

  3. The "too heavy" crime

With 3 or 4 text-heavy slides, give the audience time to read them.

Informing

Promise

Provide a promise upfront: something the audience will learn from the talk.

Inspiration

Be someone who shows passion for what they are doing.

How to think

Teach people how to think

Persuading

Job talks

Someone who is familiar with the talk is not a good judge of whether it is a good presentation.

  • Vision: problem & approach
  • Done something: the steps needed to solve the problem

You have only five minutes to show both!

  • Contributions

Getting famous: being recognized for what you did

  • Why care? You get used to being famous, but you never get used to not being recognized for what you have done

  • How to get recognized

  1. Symbol: arch
  2. Slogan: one-shot learning
  3. Surprise
  4. Salient idea: near miss
  5. Story: how you did it, how it works, and why it is important

How to stop

  • Final slides

There are many uncool ways to end a presentation; instead, finish with a slide summarizing your key contributions.

  • Final words
  1. Telling a joke is okay
  2. Do not thank the audience. It is a weak move.
  3. Salute the audience: tell how much you valued your time with the audience