vision

Mistral's Small 4 consolidates reasoning, vision and coding into one model — at a fraction of the inference cost

20 Mar 2026

venturebeat.com

Mistral's Small 4 combines reasoning, multimodal analysis and agentic coding in a single open-source model with configurable inference effort, offering enterprises a lower-cost alternative to running separate models for each task.

Autonomous Driving: Assessment Of YOLO Algorithms (RMIT et al.)

11 Feb 2026

semiengineering.com

A new technical paper titled “Advances in You Only Look Once (YOLO) algorithms for lane and object detection in autonomous vehicles” was published by RMIT University, Kyungpook National University, Deakin University and the RCA Robotics Laboratory, Royal College of Art. Abstract: “Ensuring the safety and efficiency of Autonomous Vehicles (AVs) necessitates highly accurate perception, especially...

Newfound 'Reality Signal' Helps the Brain Tell Imagination from Real Life

5 Sep 2025

scientificamerican.com

Seeing and imagining use similar brain machinery. New research reveals the brain circuit that identifies what is real, which may help scientists understand conditions such as schizophrenia

Is It Cake? How Our Brain Deciphers Materials - Nautilus

9 Jul 2025

nautil.us

Neuroscientists are discovering how this basic ability, essential to our survival, works

The ‘Profound’ Experience of Seeing a New Color

27 Apr 2025

theatlantic.com

The ecstasy of “olo”

Aphantasia: Inside The Brains Of People Who Have No Mind’s Eye

28 Jan 2025

vice.com

Did you know some people can’t see images in their minds? It’s a real issue—and it has a name: aphantasia.

A Visual Guide to Vision Transformers | MDTURP

16 Apr 2024

blog.mdturp.ch

This is a visual guide (scroll story) to Vision Transformers (ViTs), a class of deep learning models that have achieved state-of-the-art performance on image classification tasks.

Blinking is more than meets the eye

15 Apr 2024

futurity.org

Eye blinks aren't just a mechanism to keep our eyes moist. Research finds that blinking plays a key role in processing visual information.

Memory Recognition and Recall in User Interfaces

17 Jan 2024

nngroup.com

Recalling items from scratch is harder than recognizing the correct option in a list of choices because the extra context helps users retrieve information from memory.

My left and right eyes see slightly different colors. Is that normal?

31 Mar 2023

getpocket.com

The Art of the Shadow: How Painters Have Gotten It Wrong for Centuries

25 Feb 2023

thereader.mitpress.mit.edu

The goal is not to expose the “slipups” of the masters but to understand the human brain.

[1702.04680v1] Visual Discovery at Pinterest

21 Dec 2022

arxiv.org

Over the past three years Pinterest has experimented with several visual search and recommendation services, including Related Pins (2014), Similar Looks (2015), Flashlight (2016) and Lens (2017)....

How ‘The Dress’ Sparked a Neuroscience Breakthrough

2 Jul 2022

wired.com

The color debate that broke the internet raised new questions about the relationship between perception and consciousness.

This Optical Illusion Has a Revelation About Your Brain and Eyes

11 Jun 2022

nytimes.com

Your pupils may be dilating when you see images like this one as your brain tries to anticipate the near future.

Everything We See Is a Mash-up of the Brain’s Last 15 Seconds of Visual Information

20 Feb 2022

getpocket.com

The brain is basically a time machine that ensures what we see is stable and continuous.

Favorites

18 Dec 2020

towardsdatascience.com

Source normalization

All Personal Feeds

18 Dec 2020

towardsdatascience.com

Candidate pair generation and initial match scoring

Practical Guide to Entity Resolution — part 5

18 Dec 2020

towardsdatascience.com

Match scoring iteration

Semantic hand segmentation using Pytorch

18 Dec 2020

towardsdatascience.com

Semantic segmentation is the task of predicting the class of each pixel in an image. This problem is more difficult than object detection…

Cameras and Lenses – Bartosz Ciechanowski

11 Dec 2020

ciechanow.ski

Interactive article explaining how cameras and lenses work.

YOLO v4 or YOLO v5 or PP-YOLO? Which should I use?

10 Dec 2020

towardsdatascience.com

What are these new YOLO releases in 2020? How do they differ? Which one should I use?

AI system for high precision recognition of hand gestures

10 Dec 2020

sciencedaily.com

Scientists have developed an Artificial Intelligence (AI) system that recognises hand gestures by combining skin-like electronics with computer vision.

Computer Vision Recipes: Best Practices and Examples

3 Nov 2020

kdnuggets.com

This is an overview of a great computer vision resource from Microsoft, which demonstrates best practices and implementation guidelines for a variety of tasks and scenarios.

Yolo v5 Object Detection Tutorial

3 Nov 2020

towardsdatascience.com

How to set up and train a Yolo v5 Object Detection model?

What is Perspective Warping? | OpenCV and Python

3 Nov 2020

towardsdatascience.com

A step-by-step guide to apply perspective transformation on images

Image Annotation for Computer Vision | CloudFactory

3 Nov 2020

info.cloudfactory.com

Machine learning is often fueled by image data. In this guide, learn the basics about image annotation, common techniques, and key workforce considerations.

End to End Pipeline for setting up Multiclass Image Classification for Data Scientists - MLWhiz

3 Nov 2020

mlwhiz.com

In this post, we’ll create an end to end pipeline for image multiclass classification using Pytorch. This will include training the model, putting the model’s results in a form that can be shown to business partners, and functions to help deploy the model easily. As an added feature, we will also look at Test Time Augmentation using Pytorch.

How to cluster images based on visual similarity

2 Nov 2020

towardsdatascience.com

Use a pre-trained neural network for feature extraction and cluster images using K-means.

Novel object captioning surpasses human performance on benchmarks

2 Nov 2020

microsoft.com

Visual vocabulary advances novel object captioning by breaking free of paired sentence-image training data in vision and language pretraining. Discover how this method helps set new state of the art on the nocaps benchmark and bests CIDEr scores of humans.

Hacked Billboards can Make Teslas See 'Phantom Objects' and Cause Havoc

2 Nov 2020

newsweek.com

Tesla's Autopilot system relies on vision rather than LIDAR, which means it can be tricked by messages on billboards and projections created by hackers.

Machine Learning Attack Series: Image Scaling Attacks · wunderwuzzi blog

31 Oct 2020

embracethered.com

From fluffy to valuable: How the brain recognises objects

19 Oct 2020

mpg.de

To recognise a chair or a dog, our brain separates objects into their individual properties and then puts them back together. Until recently, it has remained unclear what these properties are. Scientists at the Max Planck Institute for Human Cognitive and Brain Sciences in Leipzig have now identified them - from “fluffy” to “valuable” - and found that all it takes is 49 properties to recognise almost any object.

A Mathematical Model Unlocks the Secrets of Vision

13 Oct 2020

getpocket.com

Mathematicians and neuroscientists have created the first anatomically accurate model that explains how vision is possible.

Oil Storage Tank’s Volume Occupancy On Satellite Imagery Using YoloV3

2 Sep 2020

towardsdatascience.com

Recognition of Oil Storage Tanks in satellite images using the Yolov3 object detection model from scratch using Tensorflow 2.x and…

New Approaches to Object Detection

2 Sep 2020

towardsdatascience.com

A brief introduction to CenterNet (Objects as Points), TTFNet and their implementation in TensorFlow 2.2+.

Why Red Means Red in Almost Every Language

10 Aug 2020

getpocket.com

The confounding consistency of color categories.

Where We See Shapes, AI Sees Textures | Quanta Magazine

10 Aug 2020

quantamagazine.org

To researchers’ surprise, deep learning vision algorithms often fail at classifying images because they mostly take cues from textures, not shapes.

FORTH-ModelBasedTracker/MocapNET: We present MocapNET, an ensemble of SNN e

11 Jul 2020

github.com

We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocula...

YOLOv5 is Here: State-of-the-Art Object Detection at 140 FPS

24 Jun 2020

blog.roboflow.ai

Less than 50 days after the release of YOLOv4, YOLOv5 improves accessibility for realtime object detection. June 29, YOLOv5 has released the first official version of the repository. We wrote a new deep dive on YOLOv5. June 12, 8:08 AM CDT Update: In response to community feedback, we have

Dimensionality Reduction in Hyperspectral Images using Python

24 Jun 2020

towardsdatascience.com

Dimensionality Reduction Techniques for Hyperspectral Images.

Image Augmentation Mastering: 15 Techniques and Useful Functions with Pyth

24 Jun 2020

towardsdatascience.com

Smooth Python code for augmenting your image datasets yourself.

Curve Detectors

17 Jun 2020

distill.pub

Part one of a three part deep dive into the curve neuron family.

Data Augmentation in YOLOv4

1 Jun 2020

towardsdatascience.com

State of the art modeling with image data augmentation and management

Virtual Background in Webcam with Body Segmentation Technique

1 Jun 2020

towardsdatascience.com

Webcam background replacement is no longer limited to Zoom; I just did it in the browser with the tensorflow.js body-pix model.

Classification of Brain MRI as Tumor/Non Tumor

1 Jun 2020

towardsdatascience.com

Image Segmentation With 5 Lines of Code

1 Jun 2020

towardsdatascience.com

Computer vision is evolving on a daily basis. Popular computer vision techniques such as image classification and object detection have been used extensively to solve a lot of computer vision…

Understanding Associative Embedding

19 May 2020

towardsdatascience.com

An elegant method to group predictions without labeling

Master the COCO Dataset for Semantic Image Segmentation

15 May 2020

towardsdatascience.com

Explore and manipulate the COCO image dataset for Semantic Image Segmentation with PyCoco, Tensorflow Keras Python libraries

Master the COCO Dataset for Semantic Image Segmentation

15 May 2020

towardsdatascience.com

Create a data generator and train your model on the COCO image dataset for Semantic Image Segmentation with PyCoco, Tensorflow Keras py

Sony’s first AI image sensor will make cameras everywhere smarter

14 May 2020

theverge.com

More computer in your camera

Lines Detection with Hough Transform

6 May 2020

towardsdatascience.com

An algorithm to find lines in images

A Deep Dive into Lane Detection with Hough Transform

6 May 2020

towardsdatascience.com

A detailed step-by-step guide to build a Lane Line Detection algorithm in OpenCV.

Some shirts hide you from cameras—but will anyone wear them?

17 Apr 2020

arstechnica.com

It’s theoretically possible to become invisible to cameras. But can it catch on?

Object Detection using YoloV3 and OpenCV

1 Apr 2020

towardsdatascience.com

An Introduction to Object Detection with YoloV3 for beginners

Image Data Labelling and Annotation — Everything you need to know

1 Apr 2020

towardsdatascience.com

Learn about different types of annotations, annotation formats and annotation tools

nandinib1999/object-detection-yolo-opencv: Object Detection using Yolo V3 and OpenCV

1 Apr 2020

github.com

Object Detection using Yolo V3 and OpenCV.

Computer Vision 101: Working with Color Images in Python

1 Apr 2020

towardsdatascience.com

Learn the basics of working with RGB and Lab images to boost your computer vision projects!

Disrupting Deepfakes: Adversarial Attacks on Image Translation Networks (Co

1 Apr 2020

github.com

🔥🔥Defending Against Deepfakes Using Adversarial Attacks on Conditional Image Translation Networks - natanielruiz/disrupting-deepfakes

Building an Image-Taking Interface Application for Your Image Recognition M

1 Apr 2020

towardsdatascience.com

Explore the Real-World Applications of Your Model

Brain Tumor Detection using Mask R-CNN

1 Apr 2020

kdnuggets.com

Mask R-CNN has been the new state of the art in terms of instance segmentation. Here I want to share some simple understanding of it to give you a first look and then we can move ahead and build our model.

Learning to See Transparent Objects

9 Mar 2020

ai.googleblog.com

Posted by Shreeyak Sajjan, Research Engineer, Synthesis AI and Andy Zeng, Research Scientist, Robotics at Google. Optical 3D range sensors, like R...

Self Supervised Depth Estimation: Breaking down the ideas

9 Mar 2020

towardsdatascience.com

Learning depth without manual annotation

Using Pytesseract to Convert Images into a HTML Site

9 Mar 2020

armaizadenwala.com

Convert images to a string with Google Tesseract and then into a static HTML site using python

Introduction to Histogram Equalization

9 Mar 2020

allaboutcircuits.com

How can digital signal processing help you equalize histograms for digital photography? Learn more here.

Dive Really Deep into YOLO v3: A Beginner’s Guide

19 Feb 2020

reddit.com

kornia/kornia: Open Source Differentiable Computer Vision Library for PyTor

19 Feb 2020

github.com

Geometric Computer Vision Library for Spatial AI.

Table Detection and Extraction Using Deep Learning

19 Feb 2020

nanonets.com

Extract table from image with Nanonets table detection OCR. Learn OCR table Deep Learning methods to detect tables in images or PDF documents.

Powerful computer vision algorithms are now small enough to run on your phone

24 Nov 2019

technologyreview.com

Researchers have shrunk state-of-the-art computer vision models to run on low-power devices. Growing pains: Visual recognition is deep learning’s strongest skill. Computer vision algorithms are analyzing medical images, enabling self-driving cars, and powering face recognition. But training models to recognize actions in videos has grown increasingly expensive. This has fueled concerns about the technology’s carbon…

Keras Mask R-CNN - PyImageSearch

30 Aug 2019

pyimagesearch.com

In this tutorial you will learn how to use Keras, Mask R-CNN, and Deep Learning for instance segmentation (both with and without a GPU).

Computer Vision for Beginners: Part 4

29 Aug 2019

medium.com

Contour detection and having a little bit of fun

YOLO: Real-Time Object Detection

29 Aug 2019

pjreddie.com

You only look once (YOLO) is a state-of-the-art, real-time object detection system.

Computer Vision for Beginners: Part 1

23 Aug 2019

kdnuggets.com

Image processing is performing some operations on images to get an intended manipulation. Think about what we do when we start a new data analysis. We do some data preprocessing and feature engineering. It’s the same with image processing.

The Hitchhiker’s Guide to Feature Extraction

20 Aug 2019

mlwhiz.com

Some Tricks and Code for Kaggle and Everyday work. This post is about useful feature engineering methods and tricks that I have learned and end up using often.

LouieYang/deep-photo-styletransfer-tf: Tensorflow (Python API) implementati

8 Jun 2018

github.com

Tensorflow (Python API) implementation of Deep Photo Style Transfer - LouieYang/deep-photo-styletransfer-tf

Learning to write programs that generate images

8 Jun 2018

deepmind.com

Through a human’s eyes, the world is much more than just the images reflected in our corneas. For example, when we look at a building and admire the intricacies of its design, we can appreciate...