What components are needed for building learning algorithms that leverage the structure and properties of graphs?
BERT is an open source machine learning framework for natural language processing (NLP) that helps computers understand ambiguous language by using context
The latest effort, from the Mayo Clinic, holds some clues.
We picked 50 papers/models/blogs across 10 fields in AI Eng: LLMs, Benchmarks, Prompting, RAG, Agents, CodeGen, Vision, Voice, Diffusion, Finetuning. If you're starting from scratch, start here.
AI art generation has entered an era of evolution over revolution
Musings on systems, information, learning, and optimization.
Pulling pre-trained models out of the box for your use case
Where can you find projects dealing with advanced ML topics? GitHub is a perfect source with its many repositories. I’ve selected ten to talk about in this article.
Aman's AI Journal | Course notes and learning material for Artificial Intelligence and Deep Learning Stanford classes.
In this article, I'll take you through a list of guided projects to master AI & ML with Python.
Deep learning is crucial in today's age as it powers advancements in artificial intelligence, enabling applications like image and speech recognition, language translation, and autonomous vehicles. Understanding deep learning equips individuals to harness its potential, driving innovation and solving complex problems across various industries. This article lists the top Deep Learning and Neural Networks books to help individuals gain proficiency in this vital field and contribute to its ongoing advancements and applications. Deep Learning (Adaptive Computation and Machine Learning series) This book covers a wide range of deep learning topics along with their mathematical and conceptual background. Additionally, it offers
Multi-layer perceptrons (MLPs), or fully-connected feedforward neural networks, are fundamental in deep learning, serving as default models for approximating nonlinear functions. Despite their importance affirmed by the universal approximation theorem, they possess drawbacks. In applications like transformers, MLPs often monopolize parameters and lack interpretability compared to attention layers. While exploring alternatives, such as the Kolmogorov-Arnold representation theorem, research has primarily focused on traditional depth-2 width-(2n+1) architectures, neglecting modern training techniques like backpropagation. Thus, while MLPs remain crucial, there's ongoing exploration for more effective nonlinear regressors in neural network design. MIT, Caltech, Northeastern researchers, and the NSF Institute for AI and
Deep learning architectures have revolutionized the field of artificial intelligence, offering innovative solutions for complex problems across various domains, including computer vision, natural language processing, speech recognition, and generative models. This article explores some of the most influential deep learning architectures: Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), Transformers, and Encoder-Decoder architectures, highlighting their unique features, applications, and how they compare against each other. Convolutional Neural Networks (CNNs) CNNs are specialized deep neural networks for processing data with a grid-like topology, such as images. A CNN automatically detects the important features without any human supervision.
Is Attention all you need? Mamba, a novel AI model based on State Space Models (SSMs), emerges as a formidable alternative to the widely used Transformer models, addressing their inefficiency in processing long sequences.
Graph representation learning aims to effectively encode high-dimensional sparse graph-structured data into low-dimensional dense vectors, which is a fundamental task that has been widely studied...
This project contains different conformal methods and approaches. Includes code generated for an experimental evaluation of a multidimensional, low-sample-size biomedical dataset of oncological sub...
Why is Adam the most popular optimizer in Deep Learning? Let's understand it by diving into...
[Special thank you to Ian Kivlichan for many useful pointers (E.g. the 100+ year old Nature paper “Vox populi”) and nice feedback. 🙏 ] High-quality data is the fuel for modern data deep learning model training. Most of the task-specific labeled data comes from human annotation, such as classification task or RLHF labeling (which can be constructed as classification format) for LLM alignment training. Lots of ML techniques in the post can help with data quality, but fundamentally human data collection involves attention to details and careful execution.
Understanding how convolutional neural networks (CNNs) operate is essential in deep learning. However, implementing these networks, especially convolutions and gradient calculations, can be challenging. Many popular frameworks like TensorFlow and PyTorch exist, but their complex codebases make it difficult for newcomers to grasp the inner workings. Meet neograd, a newly released deep learning framework developed from scratch using Python and NumPy. This framework aims to simplify the understanding of core concepts in deep learning, such as automatic differentiation, by providing a more intuitive and readable codebase. It addresses the complexity barrier often associated with existing frameworks, making it easier for
This article will teach you about self-attention mechanisms used in transformer architectures and large language models (LLMs) such as GPT-4 and Llama.
Theory and PyTorch Implementation
In this article we will learn about their definitions, their differences, and how to calculate FLOPs and MACs using Python packages.
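The article itself is not reproduced here, but as a rough sketch of the kind of measurement it describes, the `thop` package can report MACs and parameter counts for a PyTorch module (this is one possible package, not necessarily the one the article uses; the ResNet-18 model and input shape are illustrative choices):

```python
# Sketch: counting MACs/params for a PyTorch model with the `thop` package.
# Assumes `pip install torch torchvision thop`; the model and input size are placeholders.
import torch
from torchvision.models import resnet18
from thop import profile

model = resnet18()
dummy_input = torch.randn(1, 3, 224, 224)           # one RGB image
macs, params = profile(model, inputs=(dummy_input,))
print(f"MACs: {macs / 1e9:.2f} G, params: {params / 1e6:.2f} M")
print(f"FLOPs (approx.): {2 * macs / 1e9:.2f} G")    # common convention: 1 MAC ~ 2 FLOPs
```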
Step-by-step explanation of how one of the most important MLSys breakthroughs works — in gory detail.
A beginner's guide to understanding autonomous AI agents and their importance.
GPUs may dominate, but CPUs could be perfect for smaller AI models
Stanford team achieves first-ever optical backpropagation milestone
Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and...
Explore what LLMs are, how they work, and gain insights into real-world examples, use cases, and best practices.
One of the creators of the famous BLIP-2 model shares his insights about the current state of multimodal generative AI.
If you're looking for a way to improve the performance of your large language model (LLM) application while reducing costs, consider utilizing a semantic cache to store LLM responses.
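The blurb above only names the idea; here is a loose sketch of how such a cache could work (not any particular library's API: `embed`, `SemanticCache`, and the threshold are all made up for illustration). A new query is embedded, compared against previously cached queries, and a close-enough match returns the stored response instead of calling the LLM again:

```python
# Minimal semantic-cache sketch: reuse a cached LLM response when a new query's
# embedding is close enough (cosine similarity) to one already seen.
import numpy as np

def embed(text):
    # Placeholder embedding: hash-based bag of characters (use a real model in practice).
    vec = np.zeros(64)
    for i, ch in enumerate(text.lower()):
        vec[(hash(ch) + i) % 64] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

class SemanticCache:
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []                 # list of (embedding, response) pairs

    def lookup(self, query):
        q = embed(query)
        for emb, response in self.entries:
            if float(np.dot(q, emb)) >= self.threshold:   # cosine sim of unit vectors
                return response                            # cache hit: skip the LLM call
        return None

    def store(self, query, response):
        self.entries.append((embed(query), response))

cache = SemanticCache()
cache.store("What is a semantic cache?", "A cache keyed by embedding similarity.")
print(cache.lookup("what is a semantic cache"))  # likely a hit with this toy embedder
```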
AI-generated images have never been more popular, and Midjourney is one of the best tools. Here's how to access the AI and what to know about using it.
A new technical paper titled “TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings” was published by researchers at Google. Abstract: “In response to innovations in machine learning (ML) models, production workloads changed radically and rapidly. TPU v4 is the fifth Google domain specific architecture (DSA) and its third supercomputer...
Sponsored Feature: Training an AI model takes an enormous amount of compute capacity coupled with high bandwidth memory. Because the model training can be
Start with Autoencoders to better understand GANs
I explain what is so unique about the RWKV language model.
Mike Wheatley / SiliconANGLE: Cerebras open sources seven GPT-based LLMs, ranging from 111M to 13B parameters and trained using its Andromeda supercomputer for AI, on GitHub and Hugging Face
Who is publishing the most impactful AI research right now? With the breakneck pace of innovation in AI, it is crucial to pick up on the signal as soon as possible. No one has the time to read everything, but these 100 papers are sure to bend the road as to where our AI technology is going. The real test of impact for R&D teams is of course how the technology appears in products, and OpenAI shook the world by releasing ChatGPT at the end of November 2022, following fast on their March 2022 paper “T
A list of artificial intelligence used in semiconductor manufacturing tools, from February 2023.
The top ML Papers of the Week (Mar 6 - Mar 12)
A new technical paper titled “APOSTLE: Asynchronously Parallel Optimization for Sizing Analog Transistors Using DNN Learning” was published by researchers at UT Austin and Analog Devices. Abstract “Analog circuit sizing is a high-cost process in terms of the manual effort invested and the computation time spent. With rapidly developing technology and high market demand, bringing...
LLaMA-13B reportedly outperforms ChatGPT-like tech despite being 10x smaller.
Posted by Sanjiv Kumar, VP and Google Fellow, Google Research (This is Part 4 in our series of posts covering different topical areas of research a...
Many new Transformer architecture improvements have been proposed since my last post on “The Transformer Family” about three years ago. Here I did a big refactoring and enrichment of that 2020 post — restructure the hierarchy of sections and improve many sections with more recent papers. Version 2.0 is a superset of the old version, about twice the length. Notation: $d$ denotes the model size / hidden state dimension / positional encoding size.
A critical analysis of Google’s impressive new text-to-image generation tool
The regurgitation of training data exposes image diffusion models to a number of privacy and copyright risks.
Find it here, via Ryan Watkins. Further improvement is required, but the pace of current breakthroughs is remarkable.
Over the last two years, a plethora of large generative models such as ChatGPT and Stable Diffusion have been published. Concretely, these models are able to perform tasks such...
Here, I provide an in-depth analysis of GPUs for deep learning/machine learning and explain what is the best GPU for your use-case and budget.
An example use case of using Variational Autoencoders (VAE) to detect anomalies in all types of data
Wasserstein distance helps WGANs outperform vanilla GANs and VAEs. This post explains why so using some easy math.
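For reference (a standard statement of the quantity, not text quoted from the post), WGANs optimize the Wasserstein-1 (earth mover's) distance in its Kantorovich-Rubinstein dual form, with the supremum taken over 1-Lipschitz critics $f$:

```latex
% Wasserstein-1 distance between the real distribution P_r and the generator
% distribution P_g, in the dual form WGANs use (f constrained to be 1-Lipschitz):
W(P_r, P_g) \;=\; \sup_{\lVert f \rVert_L \le 1}
  \; \mathbb{E}_{x \sim P_r}[f(x)] \;-\; \mathbb{E}_{x \sim P_g}[f(x)]
```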
Many developers who use Python for machine learning are now switching to PyTorch. Find out why and what the future could hold for TensorFlow.
Geoffrey Hinton, professor at the University of Toronto and engineering fellow at Google Brain, recently published a paper on the Forward-Forward algorithm (FF), a technique for training neural networks that uses two forward passes of data through the network, instead of backpropagation, to update the model weights.
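As a toy sketch of the local objective described in the paper (assuming its "goodness" definition, a sum of squared activations, here averaged so the threshold stays small): each layer is trained, from its own forward pass only, to push goodness above a threshold for "positive" (real) data and below it for "negative" data. Dimensions, learning rate, and threshold below are made up.

```python
# Toy Forward-Forward-style local update for a single layer (a sketch, not the paper's code).
import torch
import torch.nn.functional as F

layer = torch.nn.Linear(784, 500)
opt = torch.optim.SGD(layer.parameters(), lr=0.03)
theta = 2.0  # goodness threshold

def goodness(x):
    h = F.relu(layer(x))
    return (h ** 2).mean(dim=1)  # per-example "goodness" (mean of squared activations)

def ff_step(x_pos, x_neg):
    # softplus(theta - g) is the negative log-likelihood of sigmoid(g - theta),
    # so this pushes goodness above theta for positives and below theta for negatives.
    g_pos, g_neg = goodness(x_pos), goodness(x_neg)
    loss = F.softplus(theta - g_pos).mean() + F.softplus(g_neg - theta).mean()
    opt.zero_grad()
    loss.backward()   # gradients stay local to this layer; no backprop through a deep stack
    opt.step()
    return loss.item()

print(ff_step(torch.randn(32, 784), torch.randn(32, 784)))
```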
Judea Pearl, a pioneering figure in artificial intelligence, argues that AI has been stuck in a decades-long rut. His prescription for progress? Teach machines to understand the question why.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch - lucidrains/vit-pytorch
The wave of enthusiasm around generative networks feels like another Imagenet moment - a step change in what ‘AI’ can do that could generalise far beyond the cool demos. What can it create, and where are the humans in the loop?
Alphabet's DeepMind has built an AI tool that can help generate rough film and stage scripts Engadget's Kris Holt reports: Dramatron is a so-called "co-writing" tool that can generate character descriptions, plot points, location descriptions and dialogue. The idea is that human writers will be abl...
Adapting federated learning to your own datasets
The first obvious casualty of large language models is homework: the real training for everyone, though, and the best way to leverage AI, will be in verifying and editing information.
An intuitive understanding of how AI-generated art is made by Stable Diffusion, Midjourney, or DALL-E
Everything you need to know to use YOLOv7 in custom training scripts
Introduction to reinforcement learning terminologies, basics, and concepts (model-free, model-based, online, offline RL)
The world's largest chip scales to new heights.
Scientist Gary Marcus argues that “deep learning” is not the only path to true artificial intelligence.
Pure python implementation of product quantization for nearest neighbor search - matsui528/nanopq
The best indexing approach for billion-sized vector datasets
Find out how the inverted file index (IVF) is implemented alongside product quantization (PQ) for a fast and efficient approximate nearest…
Efficient vector quantization for machine learning optimizations (esp. vector-quantized variational autoencoders), better than straight…
If you thought text-to-image AI was unbelievable, wait until you see how it compresses images.
Attention, Self-Attention, Multi-head Attention, Masked Multi-head Attention, Transformers, BERT, and GPT
Attention, Self-Attention, Multi-head Attention, and Transformers
This post explains how permutation importance works and how to implement it using ELI5.
To mark the occasion of the eighth Federated Logic Conference (FloC), Amazon’s Byron Cook, Daniel Kröning, and Marijn Heule discussed automated reasoning’s prospects.
The brain, the mathematics, and DL — research frontiers in 2022
The fastai book, published as Jupyter Notebooks.
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge. - d2l-ai/d2l-en
An overview of DCGAN architecture with a step-by-step guide to building it yourself
We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation. It comprises 60,000 images of artwork from 10...
Posted by Ethan Dyer and Guy Gur-Ari, Research Scientists, Google Research, Blueshift Team Language models have demonstrated remarkable performance...
Learn how Imagen generates photorealistic images given only a text description.
We used GPT-3 and DALL·E to generate a children's storybook about Ash and Pikachu vs. Team Rocket. Read the story and marvel at the AI-generated visuals!
A detailed step-by-step explanation of how to build an image-captioning model in Pytorch
In a recent policy change, Google has banned deepfake-generating AI projects from Colab, its platform for hosting and running arbitrary Python code.
Dall-E can illustrate just about anything using a short text prompt. Should it?
PimEyes is a paid service that finds photos of a person from across the internet, including some the person may not want exposed. “We’re just a tool provider,” its owner said.
A comparison between Undercomplete and Sparse AE with a detailed Python example
The researchers are considered a key to the company’s future. But they have had a hard time shaking infighting and controversy over a variety of issues.
We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning...
Posted by Sharan Narang and Aakanksha Chowdhery, Software Engineers, Google Research In recent years, large neural networks trained for language un...
What is an Autoencoder, and how to build one in Python?
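The article itself isn't reproduced here, but a minimal PyTorch sketch of the idea (arbitrary layer sizes, random data standing in for real images) shows the two halves: an encoder that compresses the input to a small latent vector and a decoder trained to reconstruct it.

```python
# Minimal autoencoder sketch: encoder -> latent vector -> decoder, trained with MSE reconstruction loss.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(64, 784)                  # a fake batch of flattened images
recon = model(x)
loss = nn.functional.mse_loss(recon, x)  # reconstruction loss
opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```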
It takes a human being around 0.1 to 0.4 seconds to blink. In even less time, an AI-based inverse rendering process developed by NVIDIA can generate a 3D scene from 2D photos.
They can help you get an appointment or order a pizza, find the best ticket deals and bring your...
Machine learning is a subfield of artificial intelligence (AI) and computer science that focuses on using data and algorithms to mimic the way people learn, progressively improving its accuracy. As such, machine learning is one of the most interesting methods in computer science these days, and it'
based on "Hands-On Machine Learning with Scikit-Learn & TensorFlow" (O'Reilly, Aurelien Geron) - bjpcjp/scikit-and-tensorflow-workbooks
based on "Hands-On Machine Learning with Scikit-Learn & TensorFlow" (O'Reilly, Aurelien Geron) - bjpcjp/scikit-and-tensorflow-workbooks
The second edition of Deep Learning Interviews is home to hundreds of fully-solved problems, from a wide range of key topics in AI. It is designed to both rehearse interview or exam specific...
Best-practices to follow when building datasets from large pools of image and video data and tools that make it straightforward.
Current object detectors are limited in vocabulary size due to the small scale of detection datasets. Image classifiers, on the other hand, reason about much larger vocabularies, as their datasets...
Is it possible to define a coefficient of correlation which is (a) as simple as the classical coefficients like Pearson's correlation or Spearman's correlation, and yet (b) consistently estimates...
The firm pivoted away from riskier spiking neural networks using a new power management scheme
Should you use PyTorch vs TensorFlow in 2023? This guide walks through the major pros and cons of PyTorch vs TensorFlow, and how you can pick the right framework.
Interactive Tools for Machine Learning, Deep Learning and Math - Machine-Learning-Tokyo/Interactive_Tools
As we previously reported, Google unveiled its second-generation TensorFlow Processing Unit (TPU2) at Google I/O last week. Google calls this new
Nvidia has staked its growth in the datacenter on machine learning. Over the past few years, the company has rolled out features in its GPUs aimed neural
Researchers from Baidu’s Silicon Valley AI Lab (SVAIL) have adapted a well-known HPC communication technique to boost the speed and scale of their neural network training and now they are […]
In this work, we analyze the performance of neural networks on a variety of heterogeneous platforms. We strive to find the best platform in terms of raw benchmark performance, performance per watt a…
Python library for computer vision labeling tasks. The core functionality is to translate bounding box annotations between different formats-for example, from coco to yolo. - GitHub - pylabel-proj...
Google did its best to impress this week at its annual IO conference. While Google rolled out a bunch of benchmarks that were run on its current Cloud TPU
Four years ago, Google started to see the real potential for deploying neural networks to support a large number of new services. During that time it was
The rise of deep-learning (DL) has been fuelled by the improvements in accelerators. GPU continues to remain the most widely used accelerator for DL applications. We present a survey of architectur…
Currently there are more than 100 companies all over the world building ASICs (application-specific integrated circuits) or SoCs (System…
GPU Computing for Data Science - Download as a PDF or view online for free
This tutorial describes methods to enable efficient processing for deep neural networks (DNNs), which are used in many AI applications including computer vision, speech recognition, robotics, etc....
A curated list of the latest breakthroughs in AI (in 2021) by release date with a clear video explanation, link to a more in-depth article, and code. - louisfb01/best_AI_papers_2021
Over the last two years, there has been a push for novel architectures to feed the needs of machine learning and more specifically, deep neural networks.
From Sensory Substitution to Decision Transformers, Persistent Evolution Strategies and Sharpness-Aware Minimization
An overview of some of the leading libraries and frameworks out there
MedMNIST v2 is a large-scale MNIST-like collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D. All images are pre-processed into 28 x 28 (2D) or 28 x 28 x 28 (3D) with the corresponding classification labels, so that no background knowledge is required for users. Covering primary data modalities in biomedical images, MedMNIST v2 is designed to perform classification on lightweight 2D and 3D images with various data scales (from 100 to 100,000) and diverse tasks (binary/multi-class, ordinal regression and multi-label). The resulting dataset, consisting of 708,069 2D images and 10,214 3D images in total, could support numerous research / educational purposes in biomedical image analysis, computer vision and machine learning. Description and image from: MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification Each subset keeps the same license as that of the source dataset. Please also cite the corresponding paper of source data if you use any subset of MedMNIST.
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating powerful ML methods into the real-time experimental...
PyTorch Lightning has opened many new possibilities in deep learning and machine learning with a high level interface that makes it quicker to work with PyTorch.
Revealing what's behind the state-of-the-art algorithm HRNet
Are there any limits to large neural networks?
A primer on automated story generation and how it strikes at some fundamental research questions in artificial intelligence.
A guide on object detection algorithms and libraries that covers use cases, technical details, and offers a look into modern applications.
The ten essential computer vision terms everyone should learn to become more proficient at computer vision, with sample code
According to the paper, their findings imply that facial recognition systems are “extremely vulnerable.”
Papers With Code highlights trending Machine Learning research and the code to implement it.
Summary: We have released GPT-J-6B, a 6B JAX-based (Mesh) Transformer LM (GitHub). GPT-J-6B performs nearly on par with 6.7B GPT-3 (or Curie) on various zero-shot downstream tasks. You can try out …
For all their triumphs, AI systems can’t seem to generalize the concepts of “same” and “different.” Without that, researchers worry, the quest to create truly intelligent machines may be hopeless.
Like its U.S. counterpart, Google, Baidu has made significant investments to build robust, large-scale systems to support global advertising programs. As
Current custom AI hardware devices are built around super-efficient, high performance matrix multiplication. This category of accelerators includes the
Reconstructing Images using Reinforcement Learning and Genetic Algorithms
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
In August, I set out to improve the machine learning ecosystem for Ruby and wasn’t sure where it would go. Over the next 5 months, I ended up...
BAAI conference presented Wu Dao 2.0. The most powerful AI to date.
This repository contains demos I made with the Transformers library by HuggingFace. - NielsRogge/Transformers-Tutorials
Sentiment Analysis, or Opinion Mining, is a subfield of NLP (Natural Language Processing) that aims to extract attitudes, appraisals, opinions, and emotions from text. Inspired by the rapid migration…
Evaluating object detection models is not straightforward because each image can have many objects and each object can belong to different classes. This means that we need to measure if the model…
Musings of a Computer Scientist.
In 2020-21, we celebrate that many of the basic ideas behind the Deep Learning Revolution were published three decades ago within fewer than 12 months in our "Annus Mirabilis" 1990-91
In the previous post, we looked at Attention – a ubiquitous method in modern deep learning models. Attention is a concept that helped improve the performance of neural machine translation applications. In this post, we will look at The Transformer – a model that uses attention to boost the speed with which these models can be trained. The Transformer outperforms the Google Neural Machine Translation model in specific tasks. The biggest benefit, however, comes from how The Transformer lends itself to parallelization. It is in fact Google Cloud’s recommendation to use The Transformer as a reference model to use their Cloud TPU offering. So let’s try to break the model apart and look at how it functions. The Transformer was proposed in the paper Attention is All You Need. A TensorFlow implementation of it is available as a part of the Tensor2Tensor package. Harvard’s NLP group created a guide annotating the paper with a PyTorch implementation. In this post, we will attempt to oversimplify things a bit and introduce the concepts one by one to hopefully make it easier to understand for people without in-depth knowledge of the subject matter. 2020 Update: I’ve created a “Narrated Transformer” video which is a gentler approach to the topic. A High-Level Look: Let’s begin by looking at the model as a single black box. In a machine translation application, it would take a sentence in one language, and output its translation in another.
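As a compact reminder of the mechanism that post illustrates, here is scaled dot-product attention written out in NumPy. This is the standard formulation from "Attention is All You Need", not code taken from the post, and the shapes are arbitrary:

```python
# Scaled dot-product attention: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over keys
    return weights @ V                                   # weighted sum of value vectors

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))    # 4 tokens, d_k = 8
print(scaled_dot_product_attention(Q, K, V).shape)        # (4, 8)
```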
How this novel neural network architecture changes the way we analyze complex data types, and powers revolutionary models like GPT-3 and BERT.
Google detailed TPUv4 at Google I/O 2021. They're accelerator chips that deliver high performance on AI workloads.
A cloud-native vector database, storage for next generation AI applications - milvus-io/milvus
An intuitive understanding of Transformers and how they are used in Machine Translation. After analyzing all the subcomponents one by one, such as self-attention and positional encodings, we explain the principles behind the Encoder and Decoder and why Transformers work so well
Combining data science and econometrics for an introduction to the DeepIV framework, including a full Python code tutorial.
Pushing AI to the edge requires new architectures, tools, and approaches.
The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought...
Computer vision is the field of computer science that focuses on replicating parts of the complexity...
This blog post is part of a mini-series that talks about the different aspects of building a PyTorch Deep Learning project using Variational Autoencoders. In Part 1, we looked at the variational…
Web-based image segmentation tool for object detection, localization, and keypoints - jsbroks/coco-annotator
Rice University computer scientists have demonstrated artificial intelligence (AI) software that runs on commodity processors and trains deep neural networks 15 times faster than platforms based on graphics ...
See how to build end-to-end NLP pipelines in a fast and scalable way on GPUs — from feature engineering to inference.
Keep your neural network alive by understanding the downsides of ReLU
Reinforcement learning (RL) is a powerful type of AI technology that can learn strategies to optimally control large, complex systems.
Algorithms off the convex path.
In recent years, GANs (generative adversarial networks) have been all the rage in the field of deep-learning generative models, leaving…
The dogfighting AI DARPA is developing is set to make the challenging migration from a synthetic environment to the real world soon.
Computer scientists from the University at Buffalo used the method to successfully detect Deepfakes taken from This Person Does Not Exist.
340 votes, 28 comments. If anyone wants to brush up on recent methods in EBMs, Normalizing Flows, GANs, VAEs, and Autoregressive models, I just…
Deep Nostalgia AI brings your photos to life just like in the Harry Potter movies.
Building an app for blood cell count detection.
The NLP application ecosystem is in its earliest stages, and it's not yet clear whether GPT-3 or a different model will be the foundation.
The laws of physics stay the same no matter one’s perspective. Now this idea is allowing computers to detect features in curved and higher-dimensional space.
The principle of equivariance to symmetry transformations enables a theoretically grounded approach to neural network architecture design. Equivariant networks have shown excellent performance and...
Synthetic data can be used to test new products and services, validate models, or test performances because it mimics the statistical property of production data. Today you'll find different types of structured and unstructured synthetic data.
The error backpropagation learning algorithm is a supervised learning technique for neural networks that calculates the gradient of the loss with respect to each of the network's weights, which is then used for gradient descent updates.
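As a concrete illustration of "gradient of the loss with respect to the weights", here is a tiny one-hidden-layer network where the backward pass is written out by hand via the chain rule (toy data and sizes; just a sketch of the idea, not the article's code):

```python
# Hand-written backpropagation for a tiny network: x -> W1, tanh -> W2 -> y_hat, squared loss.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 3))          # one example, 3 features
y = np.array([[1.0]])                # target
W1 = rng.normal(size=(3, 4)) * 0.1
W2 = rng.normal(size=(4, 1)) * 0.1
lr = 0.1

for step in range(100):
    # forward pass
    h = np.tanh(x @ W1)              # hidden activations
    y_hat = h @ W2                   # prediction
    loss = 0.5 * np.sum((y_hat - y) ** 2)

    # backward pass (chain rule, layer by layer)
    d_yhat = y_hat - y               # dL/dy_hat
    dW2 = h.T @ d_yhat               # dL/dW2
    d_h = d_yhat @ W2.T              # dL/dh
    dW1 = x.T @ (d_h * (1 - h ** 2)) # dL/dW1, using tanh'(z) = 1 - tanh(z)^2

    W1 -= lr * dW1                   # gradient descent step
    W2 -= lr * dW2

print(round(loss, 6))                # loss shrinks toward 0
```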
A simple technique for boosting accuracy on ANY model you use
The data science and artificial intelligence terms you need while reading the latest research
Speech and natural language processing (NLP) have become the foundation for most of the AI development in the enterprise today, as textual data represents a significant portion of unstructured content.
Activation functions are functions which take an input signal and convert it to an output signal. Activation functions introduce…
The ability of the Generative Adversarial Networks (GANs) framework to learn generative models mapping from simple latent distributions to arbitrarily complex data distributions has been...
Recently, a colleague of mine asked me a few questions like “why do we have so many activation functions?”, “why is that one works better…
Activation functions are functions that we apply in neural networks after (typically) applying an affine transformation combining weights and input features. They are typically non-linear functions. The rectified linear unit, or ReLU, has been the most popular in the past decade, although the choice is architecture dependent and many alternatives have emerged in recent years. In this section, you will find a constantly updating list of activation functions.
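A few of the functions that list typically covers, written out directly (standard definitions, not tied to any specific entry above):

```python
# Common activation functions: ReLU, leaky ReLU, sigmoid, and tanh.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, a=0.01):
    return np.where(x > 0, x, a * x)   # small slope instead of zero for negative inputs

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-3, 3, 7)
for name, fn in [("relu", relu), ("leaky_relu", leaky_relu),
                 ("sigmoid", sigmoid), ("tanh", np.tanh)]:
    print(name, np.round(fn(x), 3))
```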
Reverse engineering the curve detection algorithm from InceptionV1 and reimplementing it from scratch.
We present Text2Gestures, a transformer-based learning method to interactively generate emotive full-body gestures for virtual agents aligned with natural language text inputs. Our method generates…
The memory capacity of embedding tables in deep learning recommendation models (DLRMs) is increasing dramatically from tens of GBs to TBs across the industry. Given the fast growth in DLRMs, novel...
Classifying cross-topic natural language texts based on their argumentative structure using deep learning
The concept: When we look at a chair, regardless of its shape and color, we know that we can sit on it. When a fish is in water, regardless of its location, it knows that it can swim. This is known as the theory of affordance, a term coined by psychologist James J. Gibson. It…
A Gentle Guide to the REINFORCE algorithm, in Plain English
Why is Model Compression important? A significant problem in the arms race to produce more accurate models is complexity, which leads to…
Papers with Code indexes various machine learning artifacts — papers, code, results — to facilitate discovery and comparison. Using this…
If you have ever had to tinker with anchor boxes, you were probably frustrated, confused and saying to yourself, “There must be another…
A recent article came out comparing public cloud providers’ face detection APIs. I was very surprised to see all of the detectors fail to…
**Pose Estimation** is a computer vision task where the goal is to detect the position and orientation of a person or an object. Usually, this is done by predicting the location of specific keypoints like hands, head, elbows, etc. in case of Human Pose Estimation. A common benchmark for this task is [MPII Human Pose](https://paperswithcode.com/sota/pose-estimation-on-mpii-human-pose) ( Image credit: [Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose](https://github.com/Daniil-Osokin/lightweight-human-pose-estimation.pytorch) )
Dimensionality reduction is the task of reducing the dimensionality of a dataset. ( Image credit: [openTSNE](https://github.com/pavlin-policar/openTSNE) )
Bayesian Inference is a methodology that employs Bayes Rule to estimate parameters (and their full posterior).
**Transfer Learning** is a machine learning technique where a model trained on one task is re-purposed and fine-tuned for a related, but different task. The idea behind transfer learning is to leverage the knowledge learned from a pre-trained model to solve a new, but related problem. This can be useful in situations where there is limited data available to train a new model from scratch, or when the new task is similar enough to the original task that the pre-trained model can be adapted to the new problem with only minor modifications. ( Image credit: [Subodh Malgonde](https://medium.com/@subodh.malgonde/transfer-learning-using-tensorflow-52a4f6bcde3e) )
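A minimal sketch of the recipe described above, using torchvision (the ImageNet-pretrained ResNet-18 and the 5-class target task are illustrative assumptions): freeze the pretrained backbone and train only a new classification head on the new, smaller dataset.

```python
# Transfer-learning sketch: reuse a pretrained backbone, replace and train only the head.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretrained backbone
for param in model.parameters():
    param.requires_grad = False            # freeze the learned representations

num_target_classes = 5                     # placeholder for the new task
model.fc = nn.Linear(model.fc.in_features, num_target_classes)    # new, trainable head

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)                           # only the new head's weight and bias
```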
**Representation Learning** is a process in machine learning where algorithms extract meaningful patterns from raw data to create representations that are easier to understand and process. These representations can be designed for interpretability, reveal hidden features, or be used for transfer learning. They are valuable across many fundamental machine learning tasks like [image classification](/task/image-classification) and [retrieval](/task/image-retrieval). Deep neural networks can be considered representation learning models that typically encode information which is projected into a different subspace. These representations are then usually passed to a linear classifier to perform, for instance, classification. Representation learning can be divided into: - **Supervised representation learning**: learning representations on task A using annotated data and used to solve task B - **Unsupervised representation learning**: learning representations on a task in an unsupervised way (label-free data). These are then used to address downstream tasks, reducing the need for annotated data when learning new tasks. Powerful models like [GPT](/method/gpt) and [BERT](/method/bert) leverage unsupervised representation learning to tackle language tasks. More recently, [self-supervised learning (SSL)](/task/self-supervised-learning) is one of the main drivers behind unsupervised representation learning in fields like computer vision and NLP. Here are some additional readings to go deeper on the task: - [Representation Learning: A Review and New Perspectives](/paper/representation-learning-a-review-and-new) - Bengio et al. (2012) - [A Few Words on Representation Learning](https://sthalles.github.io/a-few-words-on-representation-learning/) - Thalles Silva ( Image credit: [Visualizing and Understanding Convolutional Networks](https://arxiv.org/pdf/1311.2901.pdf) )
**Object tracking** is the task of taking an initial set of object detections, creating a unique ID for each of the initial detections, and then tracking each of the objects as they move around frames in a video, maintaining the ID assignment. State-of-the-art methods involve fusing data from RGB and event-based cameras to produce more reliable object tracking. CNN-based models using only RGB images as input are also effective. The most popular benchmark is OTB. There are several evaluation metrics specific to object tracking, including HOTA, MOTA, IDF1, and Track-mAP. ( Image credit: [Towards-Realtime-MOT ](https://github.com/Zhongdao/Towards-Realtime-MOT) )
**Image Retrieval** is a fundamental and long-standing computer vision task that involves finding images similar to a provided query from a large database. It's often considered as a form of fine-grained, instance-level classification. Not just integral to image recognition alongside [classification](/task/image-classification) and [detection](/task/image-detection), it also holds substantial business value by helping users discover images aligning with their interests or requirements, guided by visual similarity or other parameters. ( Image credit: [DELF](https://github.com/tensorflow/models/tree/master/research/delf) )
**Zero-shot learning (ZSL)** is a model's ability to detect classes never seen during training. The condition is that the classes are not known during supervised learning. Earlier work in zero-shot learning use attributes in a two-step approach to infer unknown classes. In the computer vision context, more recent advances learn mappings from image feature space to semantic space. Other approaches learn non-linear multimodal embeddings. In the modern NLP context, language models can be evaluated on downstream tasks without fine tuning. Benchmark datasets for zero-shot learning include [aPY](/dataset/apy), [AwA](/dataset/awa2-1), and [CUB](/dataset/cub-200-2011), among others. ( Image credit: [Prototypical Networks for Few shot Learning in PyTorch ](https://github.com/orobix/Prototypical-Networks-for-Few-shot-Learning-PyTorch) ) Further readings: - [Zero-Shot Learning -- A Comprehensive Evaluation of the Good, the Bad and the Ugly](https://paperswithcode.com/paper/zero-shot-learning-a-comprehensive-evaluation) - [Zero-Shot Learning in Modern NLP](https://joeddav.github.io/blog/2020/05/29/ZSL.html) - [Zero-Shot Learning for Text Classification](https://amitness.com/2020/05/zero-shot-text-classification/)
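In the "modern NLP" sense mentioned above, zero-shot classification can be tried directly with the Hugging Face `transformers` pipeline, where an NLI-based model scores arbitrary candidate labels it was never explicitly trained on (the text and labels below are made-up examples; the default model is downloaded on first use):

```python
# Zero-shot text classification sketch with the transformers pipeline.
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
result = classifier(
    "The new GPU cut our training time in half.",
    candidate_labels=["hardware", "cooking", "politics"],
)
print(result["labels"][0], round(result["scores"][0], 3))  # highest-scoring label
```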
**Anomaly Detection** is a binary classification task that identifies unusual or unexpected patterns in a dataset, which deviate significantly from the majority of the data. The goal of anomaly detection is to identify such anomalies, which could represent errors, fraud, or other types of unusual events, and flag them for further investigation. [Image source]: [GAN-based Anomaly Detection in Imbalance Problems](https://paperswithcode.com/paper/gan-based-anomaly-detection-in-imbalance)
**Few-Shot Learning** is an example of meta-learning, where a learner is trained on several related tasks, during the meta-training phase, so that it can generalize well to unseen (but related) tasks with just few examples, during the meta-testing phase. An effective approach to the Few-Shot Learning problem is to learn a common representation for various tasks and train task specific classifiers on top of this representation. Source: [Penalty Method for Inversion-Free Deep Bilevel Optimization ](https://arxiv.org/abs/1911.03432)
**Depth Estimation** is the task of measuring the distance of each pixel relative to the camera. Depth is extracted from either monocular (single) or stereo (multiple views of a scene) images. Traditional methods use multi-view geometry to find the relationship between the images. Newer methods can directly estimate depth by minimizing the regression loss, or by learning to generate a novel view from a sequence. The most popular benchmarks are KITTI and NYUv2. Models are typically evaluated according to a RMS metric. Source: [DIODE: A Dense Indoor and Outdoor DEpth Dataset ](https://arxiv.org/abs/1908.00463)
**Facial Recognition** is the task of making a positive identification of a face in a photo or video image against a pre-existing database of faces. It begins with detection - distinguishing human faces from other objects in the image - and then works on identification of those detected faces. The state-of-the-art tables for this task are contained mainly in its constituent parts: the face verification and face identification tasks. ( Image credit: [Face Verification](https://shuftipro.com/face-verification) )
**Action Recognition** is a computer vision task that involves recognizing human actions in videos or images. The goal is to classify and categorize the actions being performed in the video or image into a predefined set of action classes. In the video domain, it is an open question whether training an action classification network on a sufficiently large dataset will give a similar boost in performance when applied to a different temporal task or dataset. The challenges of building video datasets have meant that most popular benchmarks for action recognition are small, having on the order of 10k videos. Please note some benchmarks may be located in the [Action Classification](https://paperswithcode.com/task/action-classification) or [Video Classification](https://paperswithcode.com/task/video-classification) tasks, e.g. Kinetics-400.
**Denoising** is a task in image processing and computer vision that aims to remove or reduce noise from an image. Noise can be introduced into an image due to various reasons, such as camera sensor limitations, lighting conditions, and compression artifacts. The goal of denoising is to recover the original image, which is considered to be noise-free, from a noisy observation. ( Image credit: [Beyond a Gaussian Denoiser](https://arxiv.org/pdf/1608.03981v1.pdf) )
**Super-Resolution** is a task in computer vision that involves increasing the resolution of an image or video by generating missing high-frequency details from low-resolution input. The goal is to produce an output image with a higher resolution than the input image, while preserving the original content and structure. ( Credit: [MemNet](https://github.com/tyshiwo/MemNet) )
Autonomous driving is the task of driving a vehicle without human intervention. Many of the state-of-the-art results can be found at more general task pages such as [3D Object Detection](https://paperswithcode.com/task/3d-object-detection) and [Semantic Segmentation](https://paperswithcode.com/task/semantic-segmentation). (Image credit: [Exploring the Limitations of Behavior Cloning for Autonomous Driving](https://arxiv.org/pdf/1904.08980v1.pdf))
Data augmentation covers techniques used to increase the amount of training data by applying different modifications to the examples in the original dataset. Data augmentation not only helps to grow the dataset but it also increases its diversity. When training machine learning models, data augmentation acts as a regularizer and helps to avoid overfitting. Data augmentation techniques have been found useful in domains like NLP and computer vision. In computer vision, transformations like cropping, flipping, and rotation are used (a minimal example follows below). In NLP, data augmentation techniques can include word swapping, deletion, and random insertion, among others. Further readings: - [A Survey of Data Augmentation Approaches for NLP](https://paperswithcode.com/paper/a-survey-of-data-augmentation-approaches-for) - [A survey on Image Data Augmentation for Deep Learning](https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0) ( Image credit: [Albumentations](https://github.com/albumentations-team/albumentations) )
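A sketch of the image-side augmentations mentioned above (cropping, flipping, rotation) using torchvision transforms; the parameter values are illustrative, not prescriptive:

```python
# Random image augmentations applied on the fly during training.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(224),        # random crop, rescaled to 224x224
    transforms.RandomHorizontalFlip(p=0.5),   # random left-right flip
    transforms.RandomRotation(degrees=15),    # small random rotation
    transforms.ToTensor(),
])

# Typically passed to a dataset, e.g.:
# dataset = torchvision.datasets.ImageFolder("data/train", transform=augment)
```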
**Domain Adaptation** is the task of adapting models across domains. This is motivated by the challenge where the test and training datasets fall from different data distributions due to some factor. Domain adaptation aims to build machine learning models that can be generalized into a target domain and dealing with the discrepancy across domain distributions. Further readings: - [A Brief Review of Domain Adaptation](https://paperswithcode.com/paper/a-brief-review-of-domain-adaptation) ( Image credit: [Unsupervised Image-to-Image Translation Networks](https://arxiv.org/pdf/1703.00848v6.pdf) )
**Image Generation** (synthesis) is the task of generating new images from an existing dataset. - **Unconditional generation** refers to generating samples unconditionally from the dataset, i.e. $p(y)$ - **[Conditional image generation](/task/conditional-image-generation)** (subtask) refers to generating samples conditionally from the dataset, based on a label, i.e. $p(y|x)$. In this section, you can find state-of-the-art leaderboards for **unconditional generation**. For conditional generation, and other types of image generations, refer to the subtasks. ( Image credit: [StyleGAN](https://github.com/NVlabs/stylegan) )
**Object Detection** is a computer vision task in which the goal is to detect and locate objects of interest in an image or video. The task involves identifying the position and boundaries of objects in an image, and classifying the objects into different categories. It forms a crucial part of vision recognition, alongside [image classification](/task/image-classification) and [retrieval](/task/image-retrieval). The state-of-the-art methods can be categorized into two main types: one-stage methods and two-stage methods: - One-stage methods prioritize inference speed, and example models include YOLO, SSD and RetinaNet. - Two-stage methods prioritize detection accuracy, and example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN. The most popular benchmark is the MSCOCO dataset. Models are typically evaluated according to a Mean Average Precision metric. ( Image credit: [Detectron](https://github.com/facebookresearch/detectron) )
**Image Classification** is a fundamental task in vision recognition that aims to understand and categorize an image as a whole under a specific label. Unlike [object detection](/task/object-detection), which involves classification and location of multiple objects within an image, image classification typically pertains to single-object images. When the classification becomes highly detailed or reaches instance-level, it is often referred to as [image retrieval](/task/image-retrieval), which also involves finding similar images in a large database. Source: [Metamorphic Testing for Object Detection Systems ](https://arxiv.org/abs/1912.12162)
**Semantic Segmentation** is a computer vision task in which the goal is to categorize each pixel in an image into a class or object. The goal is to produce a dense pixel-wise segmentation map of an image, where each pixel is assigned to a specific class or object. Some example benchmarks for this task are Cityscapes, PASCAL VOC and ADE20K. Models are usually evaluated with the Mean Intersection-Over-Union (Mean IoU) and Pixel Accuracy metrics. ( Image credit: [CSAILVision](https://github.com/CSAILVision/semantic-segmentation-pytorch) )
Or, How to Act Like You Know About the Biggest AI Development since CNNs
From predicting a single sentence to fine-tuning on a custom dataset to finding the best hyperparameter configuration.
How Temporal Convolutional Networks are moving in favor of Sequence Modeling — Stock Trend Prediction.
Semantic segmentation is the task of predicting the class of each pixel in an image. This problem is more difficult than object detection…
Determining the optimal architectural parameters reduces network size by 84% while improving performance on natural-language-understanding tasks.
What are these new YOLO releases in 2020? How do they differ? Which one should I use?
Scientists have developed an Artificial Intelligence (AI) system that recognises hand gestures by combining skin-like electronics with computer vision.
Bringing structure to an unstructured task
Federated learning means enabling on-device training, model personalization, and more. Read more about it in this article.
What is Transfer Learning? Where can I use it? Why should I use it? How can I use it? Read On to find out!
Using Auto-Encoders to gain insight into data
443K subscribers in the learnmachinelearning community. A subreddit dedicated to learning machine learning
This is an overview of a great computer vision resource from Microsoft, which demonstrates best practices and implementation guidelines for a variety of tasks and scenarios.
AI researchers are using heartbeat detection to identify deepfake videos and even to figure out what kind of generative model created a deepfake.
In Part 1 of this series, we introduced the concept of embedding vectors. In Part 2, we discussed how embedding vectors can be used in…
Through a joint collaboration, Papers with Code now provides category classification and code references for articles in the arXiv…
A review of 20+ deep learning NLP models and how to use them well
Machine learning is often fueled by image data. In this guide, learn the basics about image annotation, common techniques, and key workforce considerations.
PyTorch has sort of become one of the de facto standards for creating Neural Networks now, and I love its interface.
Here, I provide an in-depth analysis of GPUs for deep learning/machine learning and explain what is the best GPU for your use-case and budget.
In this post, we’ll create an end-to-end pipeline for image multiclass classification using PyTorch. This will include training the model, putting the model’s results in a form that can be shown to business partners, and functions to help deploy the model easily. As an added feature we will look at Test Time Augmentation using PyTorch also.
One infographic that explains how Reinforcement Learning, Deep Learning and Monte Carlo Tree Search are used in AlphaGo Zero.
Reading suggestions to keep you up-to-date with the latest and classic breakthroughs in AI and Data Science
AI researchers from the Ludwig Maximilian University (LMU) of Munich have developed a bite-sized text generator capable of besting OpenAI’s state of the art GPT-3 using only a tiny fraction of its parameters. GPT-3 is a monster of an AI sys
What does Microsoft getting an "exclusive license" to GPT-3 mean for the future of AI democratization?
Read this accessible and conversational article about understanding transformers, the data science way — by asking a lot of questions that is.
Which kinds of autoencoders exist and what are their applications?
Use a pre-trained neural network for feature extraction and cluster images using K-means.
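A sketch of the recipe in that blurb: use a pretrained CNN as a fixed feature extractor, then cluster the resulting feature vectors with K-means (the ResNet-18 backbone, cluster count, and random tensors standing in for preprocessed images are all illustrative assumptions):

```python
# Feature extraction with a pretrained CNN, then K-means clustering of the features.
import torch
import torch.nn as nn
from torchvision import models
from sklearn.cluster import KMeans

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()                 # drop the classifier; keep 512-d features
backbone.eval()

images = torch.rand(20, 3, 224, 224)        # placeholder for 20 preprocessed images
with torch.no_grad():
    features = backbone(images).numpy()     # shape (20, 512)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
print(labels)                               # cluster assignment for each image
```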
Visual vocabulary advances novel object captioning by breaking free of paired sentence-image training data in vision and language pretraining. Discover how this method helps set new state of the art on the nocaps benchmark and bests CIDEr scores of humans.
Get up to date on one of the most promising Deep Learning technologies there is right now
Tesla's Autopilot system relies on vision rather than LIDAR, which means it can be tricked by messages on billboards and projections created by hackers.
Nvidia tops MLPerf records again, consortium adds benchmarks to measure mobile
With PyTorch and TensorFlow incorporated, the authors hope to gain a wider audience.
Self-Organizing Maps for Dimension Reduction, Data Visualization, and Clustering
Recognition of Oil Storage Tanks in satellite images using the Yolov3 object detection model from scratch using Tensorflow 2.x and…
A brief introduction to CenterNet (Objects as Points), TTFNet and their implementation in TensorFlow 2.2+.
Standard, Recurrent, Convolutional, & Autoencoder Networks
To researchers’ surprise, deep learning vision algorithms often fail at classifying images because they mostly take cues from textures, not shapes.
Less than 50 days after the release of YOLOv4, YOLOv5 improves accessibility for realtime object detection. June 29: YOLOv5 has released the first official version of the repository. We wrote a new deep dive on YOLOv5. June 12, 8:08 AM CDT Update: In response to community feedback, we have
Deep Dive into DNNs, CNNs, and RNNs: Dropout Methods for Regularization, Monte Carlo Uncertainty, and Model Compression
Simple Python code to augment your image datasets yourself.
How I Generated New Images from Random Data using DCGAN
Let’s dive deep into the architectural details of all the different EfficientNet Models and find out how they differ from each other.
Webcam background change is not limited to Zoom now, I just did it in the browser with tensorflow.js body-pix model
Computer vision is evolving on a daily basis. Popular computer vision techniques such as image classification and object detection have been used extensively to solve a lot of computer vision…
We reached out to further members of the AI community for their recommendations of papers which everyone should be reading! All of the cited papers are free to access and cover a range of topics from some incredible minds.
This post collates research on the advancements of Natural Language Processing (NLP) over the years.
We’re releasing an analysis showing that since 2012 the amount of compute needed to train a neural net to the same performance on ImageNet classification has been decreasing by a factor of 2 every 16 months. Compared to 2012, it now takes 44 times less compute to train a neural network to the level of AlexNet (by contrast, Moore’s Law would yield an 11x cost improvement over this period). Our results suggest that for AI tasks with high levels of recent investment, algorithmic progress has yielded more gains than classical hardware efficiency.
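As a rough sanity check on those numbers (the ~87-month span between AlexNet in 2012 and the mid-2019 analysis is an assumption, not a figure from the post):

```latex
% Compute requirements halving every 16 months, sustained for about 87 months:
2^{87/16} = 2^{5.44} \approx 43\times \quad \text{(consistent with the reported 44x)}
% A Moore's-law-style doubling every ~24 months over a similar span gives roughly
2^{87/24} \approx 12\times \quad \text{(the same ballpark as the cited ~11x)}
```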
Making the design process faster and more efficient by generating 3D objects from natural language descriptions.
A library for efficient similarity search and clustering of dense vectors. - facebookresearch/faiss
Explore and manipulate the COCO image dataset for Semantic Image Segmentation with PyCoco, Tensorflow Keras Python libraries
Create a data generator and train your model on the COCO image dataset for Semantic Image Segmentation with PyCoco, Tensorflow Keras py
Usage of the new model, with examples and Colab Notebook.
An Overview Of popular python libraries for Natural Language Processing
In this article we’ll serve the Tensorflow Object Detection API with Flask, Dockerize the Application and deploy it on Kubernetes.
Paper: https://arxiv.org/abs/1910.07454 Blog Post: http://www.offconvex.org/2020/04/24/ExpLR1/ "We report experiments that state-of-the-art networks…
The new tool shows the potential of data visualizations for understanding features in a neural network.
Introduction on Stacked Auto-encoder and Technical Walk-through on Model Creation using Pytorch
We review the cost of training large-scale language models, and the drivers of these costs. The intended audience includes engineers and scientists budgeting their model-training experiments, as...
Bringing Neural Architecture into Recommendations
AI looks well-suited for short-term weather forecasts
New approach to meta-reinforcement learning minimizes the need for costly interactions with the environment.
Extracting topics is a good unsupervised data-mining technique to discover the underlying relationships between texts. There are many…
We’ll show you how to quickly build a Streamlit app to synthesize celebrity faces using GANs, Tensorflow, and st.cache.
It’s theoretically possible to become invisible to cameras. But can it catch on?
Google published an article “Understanding searches better than ever before” and positioned BERT as one of the most important updates to…
Statistics, Algorithms, Deep Learning, NLP, & Data Organization
An Introduction to Object Detection with YoloV3 for beginners
Learn about different types of annotations, annotation formats and annotation tools
Object Detection using Yolo V3 and OpenCV.
TLDR This is a Free online text summarizing tool that automatically condenses long articles, documents, essays, or papers into key summary paragraphs using state-of-the-art AI.
An Explanation and Implementation of Matrix Factorization
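Not the article's implementation, but a minimal matrix-factorization sketch in the same spirit: learn low-rank user and item factors by gradient steps on the observed entries of a toy ratings matrix (all values below are made up).

```python
# Matrix factorization by gradient descent on observed ratings, with L2 regularization.
import numpy as np

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)        # 0 = unobserved rating
mask = R > 0
k, lr, reg = 2, 0.01, 0.02                       # latent dim, learning rate, penalty

rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(R.shape[0], k))  # user factors
V = rng.normal(scale=0.1, size=(R.shape[1], k))  # item factors

for _ in range(2000):
    err = (R - U @ V.T) * mask                   # error only on observed entries
    U += lr * (err @ V - reg * U)                # descend the squared-error loss
    V += lr * (err.T @ U - reg * V)

print(np.round(U @ V.T, 1))                      # reconstructed / predicted ratings
```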
TensorFlow code and pre-trained models for BERT.
2021 Update: I created this brief and highly accessible video intro to BERT. The year 2018 has been an inflection point for machine learning models handling text (or more accurately, Natural Language Processing or NLP for short). Our conceptual understanding of how best to represent words and sentences in a way that best captures underlying meanings and relationships is rapidly evolving. Moreover, the NLP community has been putting forward incredibly powerful components that you can freely download and use in your own models and pipelines (It’s been referred to as NLP’s ImageNet moment, referencing how years ago similar developments accelerated the development of machine learning in Computer Vision tasks).
Visualizing machine learning one concept at a time.
🔥🔥Defending Against Deepfakes Using Adversarial Attacks on Conditional Image Translation Networks - natanielruiz/disrupting-deepfakes
Explore the Real-World Applications of Your Model
Mask R-CNN has been the new state of the art in terms of instance segmentation. Here I want to share a simple understanding of it to give you a first look, and then we can move ahead and build our model.
Convolutional Neural Networks, Part 2: detailed convolutional architectures enabling object-detection and face-recognition algorithms.
The toughest part of machine learning with Spark isn't what you think it is.
MLCommons ML benchmarks help balance the benefits and risks of AI through quantitative tools that guide responsible AI development.
A deep dive into the tricks that make Neural Style Transfer work
A spatial transformer network is a specialized module for convolutional neural networks (CNNs) that learns to apply spatial transformations to its input, helping the model focus on the relevant object in an image.
How to use Snorkel’s multi-class implementation to create multi-labels
In a preprint paper, researchers at Johns Hopkins detail TrojAI, a framework for hardening AI models against adversarial attacks.
Since deep neural networks were developed, they have made huge contributions to everyday life. Machine learning provides more rational advice than humans are capable of in almost every aspect of...
In this tutorial, you will learn how to get started with your NVIDIA Jetson Nano, including installing Keras + TensorFlow, accessing the camera, and performing image classification and object detection.
Optimizing value functions by bootstrapping through experience.
Introducing a fast, easy to use, deep learning based dimensionality reduction tool
Pre-training SmallBERTa: a tiny model to train on a tiny dataset. An end-to-end Colab notebook that allows you to train your own LM (using HuggingFace…
A step-by-step tutorial from data import to accuracy evaluation
TLDR: This is basically about converting the original attention paper by Yoshua Bengio’s group to flowcharts. Check the last diagram…
All the essential Deep Learning Algorithms you need to know including models used in Computer Vision and Natural Language Processing
We saw how attention works and how it improved neural machine translation systems (see the previous blog post); now we are going to unveil the secrets behind the power of today's most famous NLP models (a.k.a. BERT and friends): the transformer. In this second part, we dive into the details of this architecture with the aim of […]
This little-known method could very well be the answer to the greatest obstacle facing artificial intelligence's adoption in health care. (from March)
What can we do when we don't have a substantial amount of varied training data? This is a quick intro to using data augmentation in TensorFlow to perform in-memory image transformations during model training to help overcome this data impediment.
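As a rough illustration of the idea (not the tutorial's exact code), the sketch below uses the Keras preprocessing layers available in recent TensorFlow 2.x releases to apply random flips, rotations, and zooms in memory during training; the small model around them is only a placeholder, and the commented-out `model.fit` call assumes `train_images` and `train_labels` exist.

```python
import tensorflow as tf

# In-memory augmentation with Keras preprocessing layers (recent TF 2.x):
# the transforms run on the fly during training, so no augmented copies of
# the dataset are ever written to disk.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

model = tf.keras.Sequential([
    augment,                                    # active only in training mode
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(train_images, train_labels, epochs=5)   # hypothetical training data
```

Because the augmentation layers sit inside the model, every epoch sees slightly different versions of the same images, which is the whole point when the original dataset is small.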
Posted by Shreeyak Sajjan, Research Engineer, Synthesis AI and Andy Zeng, Research Scientist, Robotics at Google Optical 3D range sensors, like R...
By popular demand, I’ve updated this article with the latest tutorials from the past 12 months. Check it out here
An early breakthrough in reinforcement learning — Off-policy Temporal-Difference control methods
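For context, Q-learning is the classic off-policy temporal-difference control method; the toy sketch below shows a single tabular update, with the state and action indices, step size, and discount factor chosen purely for illustration.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One off-policy TD control (Q-learning) update: the target bootstraps
    from the greedy action in s_next, regardless of which action the
    behaviour policy will actually take next."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# Tiny illustration on a 4-state, 2-action table
Q = np.zeros((4, 2))
Q = q_learning_update(Q, s=0, a=1, r=1.0, s_next=2)
print(Q)
```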
What is sentiment analysis, how to perform it, and how it can help your business.
VAEs and where to find them
Transformers are a type of neural network architecture that have been gaining popularity. Transformers were recently used by OpenAI in…
A subreddit dedicated to learning machine learning.
What research caught the public imagination in 2019? Check out our annual list of papers with the most attention.
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
Posted by Nikita Kitaev, Student Researcher, UC Berkeley and Łukasz Kaiser, Research Scientist, Google Research Understanding sequential data — s...
In this paper, a feature boosting network is proposed for estimating 3D hand pose and 3D body pose from a single RGB image. In this method, the features learned by the convolutional layers are boosted with a new long short-term dependence-aware (LSTD) module, which enables the intermediate convolutional feature maps to perceive the graphical long short-term dependency among different hand (or body) parts using the designed Graphical ConvLSTM. Learning a set of features that are reliable and discriminatively representative of the pose of a hand (or body) part is difficult due to the ambiguities, texture and illumination variation, and self-occlusion in the real application of 3D pose estimation. To improve the reliability of the features for representing each body part and enhance the LSTD module, we further introduce a context consistency gate (CCG) in this paper, with which the convolutional feature maps are modulated according to their consistency with the context representations. We evaluate the proposed method on challenging benchmark datasets for 3D hand pose estimation and 3D full body pose estimation. Experimental results show the effectiveness of our method that achieves state-of-the-art performance on both of the tasks.
We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code has been made available at: https://github.com/facebookresearch/Detectron.
The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors. Code is at: https://github.com/facebookresearch/Detectron.
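A minimal NumPy sketch of the binary focal loss described in the abstract, using the paper's usual defaults gamma=2 and alpha=0.25; this is an illustrative re-implementation, not the authors' Detectron code.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss on predicted probabilities p and labels y in {0, 1}.
    The (1 - p_t)**gamma factor down-weights well-classified examples so the
    rare, hard positives dominate the training signal."""
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)             # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -np.mean(alpha_t * (1.0 - p_t) ** gamma * np.log(p_t))

p = np.array([0.9, 0.2, 0.7, 0.05])                # predicted foreground probabilities
y = np.array([1, 1, 0, 0])
print(focal_loss(p, y))     # well-classified examples contribute almost nothing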
Machine learning provides an entirely new way to tackle one of the classic problems of applied mathematics.
Extract table from image with Nanonets table detection OCR. Learn OCR table Deep Learning methods to detect tables in images or PDF documents.
From deepfakes and virtual celebrities to "fake news," we'll cover popular cases of media synthesis and the research publications detailing how it's done.
This figure was adapted from a similar image published in DistilBERT. Turing Natural Language Generation (T-NLG) is a 17 billion parameter language model by Microsoft that outperforms the state of the art on many downstream NLP tasks. We present a demo of the model, including its freeform generation, question answering, and summarization capabilities, to academics […]
In this article we explain what GauGANs are, and how their architecture and objective functions work. This is part of a series on Nvidia GauGANs.
I bet most of us have seen a lot of AI-generated faces in recent times, be it in papers or blogs. We have reached a stage where it is becoming increasingly difficult to distinguish between real human faces and faces generated by artificial intelligence. In this post, I will help the reader understand how they can create and build such applications on their own. I will try to keep this post as intuitive as possible for starters while not dumbing it down too much. This post is about understanding how GANs work.
Marketing scientist Kevin Gray asks Dr. Anna Farzindar of the University of Southern California about Automatic Text Summarization and the various ways it is used.
In this article, I will present five techniques to prevent overfitting while training neural networks.
With new neural network architectures popping up every now and then, it’s hard to keep track of them all. Knowing all the abbreviations being thrown around (DCIGN, BiLSTM, DCGAN, anyone?) can be a bit overwhelming at first. So I decided to compose a cheat sheet containing many of those architectures. Most of these are neural networks, some are completely […]
Autoencoders can be a very powerful tool for leveraging unlabeled data to solve a variety of problems, such as learning a "feature extractor" that helps build powerful classifiers, finding anomalies, or doing a Missing Value Imputation.
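A small sketch of the idea in Keras: a dense autoencoder whose bottleneck doubles as a feature extractor and whose reconstruction error can flag anomalies. The 784-dimensional input size and the random stand-in data are assumptions for illustration, not the article's own setup.

```python
import numpy as np
import tensorflow as tf

# Minimal dense autoencoder: compress 784-d inputs to a 32-d code and
# reconstruct them. Large reconstruction error can flag anomalies, and the
# encoder alone can serve as an unsupervised feature extractor.
inputs = tf.keras.Input(shape=(784,))
code = tf.keras.layers.Dense(32, activation="relu")(inputs)
decoded = tf.keras.layers.Dense(784, activation="sigmoid")(code)

autoencoder = tf.keras.Model(inputs, decoded)
encoder = tf.keras.Model(inputs, code)
autoencoder.compile(optimizer="adam", loss="mse")

x = np.random.rand(256, 784).astype("float32")        # stand-in for real data
autoencoder.fit(x, x, epochs=1, batch_size=64, verbose=0)
errors = np.mean((autoencoder.predict(x, verbose=0) - x) ** 2, axis=1)
features = encoder.predict(x, verbose=0)               # learned representations
```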
Knowledge distillation is a model compression technique whereby a small network (student) is taught by a larger trained neural network (teacher). The smaller network is trained to behave like the large neural network. This enables the deployment of such models… (from "Research Guide: Model Distillation Techniques for Deep Learning")
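To make the teacher-student idea concrete, here is a minimal NumPy sketch (not taken from the guide) of a typical distillation objective: a blend of hard-label cross-entropy and KL divergence between temperature-softened teacher and student distributions. The temperature T and mixing weight alpha are illustrative choices.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend of (a) cross-entropy against the hard labels and (b) KL divergence
    between temperature-softened teacher and student output distributions."""
    p_student = softmax(student_logits)
    hard = -np.mean(np.log(p_student[np.arange(len(labels)), labels] + 1e-12))
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    soft = np.mean(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1))
    return alpha * hard + (1 - alpha) * (T ** 2) * soft   # T^2 keeps gradient scales comparable

student = np.array([[2.0, 0.5, -1.0]])
teacher = np.array([[3.0, 1.0, -2.0]])
print(distillation_loss(student, teacher, labels=np.array([0])))
```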
Detailed derivations and open-source code to analyze the receptive fields of convnets.
With recent developments in deep learning, neural networks are getting larger and larger. For example, in the ImageNet recognition challenge, the winning model, from 2012 to 2015, increased in size by 16 times. And in just one year, for Baidu’s… (from "The 5 Algorithms for Efficient Deep Learning Inference on Small Devices")
An easy explanation (or at least I tried!)
Policy Gradient is all you need! A step-by-step tutorial for well-known PG methods. - MrSyee/pg-is-all-you-need
A new method that provides accurate, real-time, computer-aided diagnosis of colorectal cancer identified tumors with 100% accuracy in a new pilot study.
Researchers have shrunk state-of-the-art computer vision models to run on low-power devices. Growing pains: Visual recognition is deep learning’s strongest skill. Computer vision algorithms are analyzing medical images, enabling self-driving cars, and powering face recognition. But training models to recognize actions in videos has grown increasingly expensive. This has fueled concerns about the technology’s carbon…
This guide explores research centered on a variety of advanced loss functions for machine learning models.
Week Two - 100 Days of Code Challenge
In this tutorial you will learn how to use Keras, Mask R-CNN, and Deep Learning for instance segmentation (both with and without a GPU).
The purpose of this post is to provide guidance on which combination of final-layer activation function and loss function should be used in…
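Without reproducing the article's specific recommendations, the standard pairings it alludes to look roughly like this in Keras; the layer sizes are placeholders, and the class names below are the stock Keras ones.

```python
import tensorflow as tf

# Common (final-layer activation, loss) pairings, sketched in Keras:

# Binary classification: one sigmoid unit + binary cross-entropy
binary_head = tf.keras.layers.Dense(1, activation="sigmoid")
binary_loss = tf.keras.losses.BinaryCrossentropy()

# Multi-class, single-label: softmax over C units + (sparse) categorical cross-entropy
multiclass_head = tf.keras.layers.Dense(10, activation="softmax")
multiclass_loss = tf.keras.losses.SparseCategoricalCrossentropy()

# Regression: linear output + mean squared error
regression_head = tf.keras.layers.Dense(1, activation=None)
regression_loss = tf.keras.losses.MeanSquaredError()
```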
In this article, you will explore how you can leverage Kubernetes, Tensorflow and Kubeflow to scale your models without having to worry about scaling the infrastructure.
What we'd like to find out about GANs that we don't know yet.
You only look once (YOLO) is a state-of-the-art, real-time object detection system.
Medallia's text analytics software tool provides actionable insights via customer and employee experience sentiment data analysis from reviews & comments.
Nvidia's GauGAN tool has been used to create more than 500,000 images, the company announced at the SIGGRAPH 2019 conference in Los Angeles.
DataIsBeautiful is a community for visualizations that effectively convey information…
A technical report on convolution arithmetic in the context of deep learning - vdumoulin/conv_arithmetic
Learn how SymphonyAI’s financial crime prevention solutions quickly deploy to uncover your risks, improve investigations, and transform your operations.
Scientists are developing AI systems called deep neural nets that can read medical images and detect disease — with astonishing efficiency
Algorithms off the convex path.
Over the last few years we have detailed the explosion in new machine learning systems with the influx of novel architectures from deep learning chip
Here we are again! We already have four tutorials on financial forecasting with artificial neural networks where we compared different…
Recurrent neural networks are a type of neural network where the outputs from previous time steps are fed as input to the current time step. This creates a network graph or circuit diagram with cycles, which can make it difficult to understand how information moves through the network. In this post, you will discover the concept of unrolling or unfolding…
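A tiny NumPy sketch of what unrolling means in practice: the same cell weights are reused at every time step, so the cyclic graph becomes a plain feedforward chain over the sequence. The shapes and the tanh cell are illustrative assumptions, not the post's code.

```python
import numpy as np

def rnn_unrolled(x_seq, W_xh, W_hh, b_h):
    """Unroll a simple tanh RNN cell over time: the cyclic graph becomes a
    chain of identical steps that all share the same weights."""
    h = np.zeros(W_hh.shape[0])
    states = []
    for x_t in x_seq:                              # one copy of the cell per time step
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
x_seq = rng.normal(size=(5, 3))                    # 5 time steps, 3 input features
W_xh = rng.normal(size=(4, 3))                     # input-to-hidden weights
W_hh = rng.normal(size=(4, 4))                     # hidden-to-hidden (recurrent) weights
b_h = np.zeros(4)
print(rnn_unrolled(x_seq, W_xh, W_hh, b_h).shape)  # (5, 4): one hidden state per step
```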
Most of us last saw calculus in school, but derivatives are a critical part of machine learning, particularly deep neural networks, which are trained by optimizing a loss function. This article is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed.
Quantized word vectors that take 8x-16x less space than regular word vectors - agnusmaximus/Word2Bits
Tensorflow (Python API) implementation of Deep Photo Style Transfer - LouieYang/deep-photo-styletransfer-tf
We present Deep Voice 3, a fully-convolutional attention-based neural text-to-speech (TTS) system. Deep Voice 3 matches state-of-the-art neural speech synthesis systems in naturalness while...
An open source framework for configuring, building, deploying and maintaining deep learning models in Python.
Generative adversarial networks (GANs) are a class of neural networks that are used in unsupervised machine learning. They help to solve such tasks as image generation from descriptions, getting high resolution images from low resolution ones, predicting which drug could treat a certain disease, retrieving images that contain a given pattern, etc. Our team asked… (from "Generative Adversarial Networks (GANs): Engine and Applications")
Through a human’s eyes, the world is much more than just the images reflected in our corneas. For example, when we look at a building and admire the intricacies of its design, we can appreciate...
Recurrent Neural Network - A curated list of resources dedicated to RNN - kjw0612/awesome-rnn
Scenarios, tutorials and demos for Autonomous Driving - microsoft/AutonomousDrivingCookbook
Google did its best to impress this week at its annual IO conference. While Google rolled out a bunch of benchmarks that were run on its current Cloud TPU
LightTag, a newly launched startup from a former NLP researcher at Citi, has built a "text annotation platform" designed to assist data scientists who
Google's new TPUs are here -- and they're quite a bit faster than last year's model.
Running large deep learning processes on Amazon Web Services EC2 is a cheap and effective way to learn and develop models. For just a few dollars you can get access to tens of gigabytes of RAM, tens of CPU cores, and multiple GPUs. I highly recommend it. If you are new to EC2 or the Linux command line, there are…
Baidu this Thursday announced the release of ApolloScape, billed as the world’s largest open-source dataset for autonomous driving…
China's tech titan Baidu just upgraded Deep Voice. The voice-cloning AI now works faster than ever and can swap a speaker's gender or change their accent.
Stay up-to-date on the latest data science and AI news in the worlds of artificial intelligence, machine learning, deep learning, implementation, and more.
Dive into our detailed explanation of what Region of Interest (RoI) pooling is in deep learning.
There were many advances in Deep Learning and AI in 2017, but few generated as much publicity and interest as DeepMind’s AlphaGo Zero. This…
Deep Learning papers reading roadmap for anyone who is eager to learn this amazing tech! - floodsung/Deep-Learning-Papers-Reading-Roadmap
How and why do different Deep Learning models work? We provide an intuitive explanation for 3 very popular DL models: Resnet, Inception, and Xception.
The Long Short-Term Memory recurrent neural network was developed for sequence prediction. In addition to sequence prediction problems, LSTMs can also be used as generative models. In this post, you will discover how LSTMs can be used as generative models. After completing this post, you will know: About generative models, with a focus on generative models for text called…
Exploding gradients are a problem where large error gradients accumulate and result in very large updates to neural network model weights during training. This has the effect of your model being unstable and unable to learn from your training data. In this post, you will discover the problem of exploding gradients with deep artificial neural networks. After completing this post,…
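One common mitigation (which the post may or may not cover in the same form) is clipping the global gradient norm before applying updates; the sketch below is an illustrative NumPy version rather than any framework's built-in.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale a list of gradient arrays so their combined L2 norm does not
    exceed max_norm, preventing exploding gradients from producing huge
    weight updates while preserving their direction."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / (total_norm + 1e-12)
        grads = [g * scale for g in grads]
    return grads

grads = [np.array([30.0, -40.0]), np.array([[5.0, 12.0]])]
print(clip_by_global_norm(grads, max_norm=1.0))    # same directions, bounded magnitude
```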
Emil Mikhailov is the founder of XIX.ai [http://XIX.ai] (YC W17). Roman Trusov is a researcher at XIX.ai. Recent studies by Google Brain have shown that any machine learning classifier can be tricked to give incorrect predictions, and with a little bit of skill, you can get them to give pretty much any result you want. This fact steadily becomes worrisome as more and more systems are powered by artificial intelligence — and many of them are crucial for our safe and comfortable life. Banks, sur
Collection of generative models, e.g. GAN, VAE in Pytorch and Tensorflow. - wiseodd/generative-models
The most cited deep learning papers.
Making Waves in Deep Learning: how deep learning applications will map onto a chip.