
Computer Vision Research Work



When we talk about “vision” capabilities, most people underestimate how much work the brain does to process light signals from the visible spectrum. What kind of processing allows us to understand color, depth, motion, speed, segments, objects, scenes, and different kinds of art, drawings, and culture? Before computer vision became a serious field within AI, mostly neurology researchers, surgeons, and brain specialists had insight into these processes. Since the AlexNet paper in 2012, new papers have appeared almost every month, and each one shows how far computer vision has come. This article is not only a chronology of computer vision research; it is also for software engineers, computer scientists, AI engineers, and anyone who wants to understand how their phone performs certain computer vision tasks and appears intelligent.

S.No. | Research Name | Short Description of Paper | Month-Year | Organization | URL
1 | DeconvNet | Deconvolutional Networks for Feature Learning | Nov 2010 | KAIST | Paper, Blog
2 | Saliency Propagation | A method for salient object detection that propagates saliency information through optimization | Apr 2014 | Chinese Academy of Sciences | Paper
3 | SDS | Simultaneous Detection and Segmentation | Jun 2014 | UC Berkeley | Paper
4 | GoogLeNet | Introduced the Inception module to increase network depth and width efficiently. | Sep 2014 | Google |
5 | VGGNet | Used small 3x3 convolution filters to increase depth, achieving high accuracy. | Sep 2014 | Oxford University |
6 | FCN | Fully Convolutional Networks for semantic segmentation | Nov 2014 | UC Berkeley | Paper
7 | HyperColumn | Multi-scale CNN feature fusion | Nov 2014 | UC Berkeley | Paper
8 | DeepLab v1 | Semantic Image Segmentation with Deep Convolutional Nets and CRFs | Dec 2014 | Google | Paper
9 | U-Net | Convolutional network for biomedical image segmentation | May 2015 | University of Freiburg | Paper, Blog
10 | Highway Network | Proposed highway layers to enable training of very deep networks. | May 2015 | University of Montreal |
11 | YOLO Series | You Only Look Once: series of real-time object detection systems (v1-v4) | Jun 2015 (v1) - Apr 2020 (v4) | University of Washington, Darknet | Paper, Blog
12 | CRF-RNN | Conditional Random Fields as Recurrent Neural Networks | Jun 2015 | University of Oxford | Paper
13 | MR-CNN & S-CNN | Multi-Region CNN and Semantic CNN for object detection | Jun 2015 | University of California, Berkeley | Paper
14 | DeepMask | Learning to Segment Object Candidates | Jun 2015 | Facebook AI Research | Paper
15 | LAPGAN | Laplacian Pyramid of Generative Adversarial Networks for image generation | Jun 2015 | Facebook AI Research | Paper
16 | CUDMedVision1 | Medical Image Segmentation System 1 | Sep 2015 | Chinese University of Hong Kong | Paper, Blog
17 | SegNet | Deep Convolutional Encoder-Decoder Architecture for Image Segmentation | Oct 2015 | University of Cambridge | Paper
18 | DilatedNet | Multi-Scale Context Aggregation by Dilated Convolutions | Nov 2015 | Princeton University | Paper
19 | CAM | Class Activation Mapping for identifying discriminative regions | Dec 2015 | MIT | Paper
20 | ParseNet | Looking Wider to See Better for semantic segmentation | Dec 2015 | UNC Chapel Hill | Paper
21 | MNC | Instance-aware Semantic Segmentation via Multi-task Network Cascades | Dec 2015 | Microsoft Research | Paper
22 | ResNet | Introduced residual learning to address vanishing gradients in deep networks. | Dec 2015 | Microsoft Research |
23 | SqueezeNet | AlexNet-level accuracy with 50x fewer parameters | Feb 2016 | UC Berkeley, Stanford | Paper, Blog
24 | SqueezeNet | Designed to reduce model size while maintaining accuracy, using 1x1 convolutions. | Feb 2016 | DeepScale, UC Berkeley |
25 | Pre-activation ResNet | Identity Mappings in Deep Residual Networks | Mar 2016 | Microsoft Research | Paper
26 | SharpMask | Learning to Refine Object Segments | Mar 2016 | Facebook AI Research | Paper
27 | InstanceFCN | Instance-sensitive Fully Convolutional Networks | Mar 2016 | Microsoft Research | Paper
28 | MultipathNet | Multiple Path Aggregation Network | Apr 2016 | Facebook AI Research | Paper
29 | R-FCN | Region-based Fully Convolutional Networks for object detection | May 2016 | Microsoft Research | Paper
30 | NOC | Neural Object Counting for object detection | May 2016 | Microsoft Research | Paper
31 | DeepLab v2 | Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution | Jun 2016 | Google | Paper
32 | DeepSim | Deep Learning Approach for Image Quality Assessment | Jun 2016 | Tsinghua University | Paper
33 | DIS | Deep Image Smoothing | Jun 2016 | University of Illinois | Paper
34 | V-Net | Fully Convolutional Neural Network for volumetric medical image segmentation | Jun 2016 | University College London | Paper
35 | 3D U-net | Volumetric Segmentation with 3D U-net | Jun 2016 | University of Freiburg | Paper
36 | ENet | Efficient Neural Network for Real-time Semantic Segmentation | Jul 2016 | University of Cambridge | Paper
37 | ResNet38 | Wider or Deeper: Revisiting the ResNet Model | Jul 2016 | KAIST | Paper
38 | DRRN | Deep Recursive Residual Network for image super-resolution | Jul 2016 | National University of Singapore | Paper
39 | Multi-Channel | Multi-Channel CNN for medical image analysis | Jul 2016 | University of California, San Diego | Paper
40 | GCN | Graph Convolutional Networks for processing graph-structured data | Sep 2016 | University of Montreal | Paper
41 | M²FCN | Multi-modal Fully Convolutional Networks for medical imaging | Sep 2016 | Chinese Academy of Sciences | Paper, Blog
42 | Graph CNN | Graph Convolutional Neural Networks | Sep 2016 | University of Montreal | Paper
43 | Grad-CAM | Gradient-weighted Class Activation Mapping | Oct 2016 | Georgia Tech | Paper
44 | ResNeXt | Aggregated Residual Transformations for Deep Neural Networks | Nov 2016 | Facebook AI Research | Paper
45 | DRN | Dilated Residual Networks | Nov 2016 | Princeton University | Paper
46 | RefineNet | Multi-Path Refinement Networks for high-resolution semantic segmentation | Nov 2016 | University of Adelaide | Paper
47 | FractalNet | Ultra-Deep Neural Networks without Residuals | Nov 2016 | University of Toronto | Paper
48 | SSD | Single Shot MultiBox Detector for real-time object detection | Dec 2016 | Google | Paper, Blog
49 | TDM | Top-Down Modulation for object detection | Dec 2016 | Carnegie Mellon University | Paper
50 | FPN | Feature Pyramid Networks for object detection | Dec 2016 | Facebook AI Research | Paper
51 | VoxResNet | Deep Voxelwise Residual Networks | Dec 2016 | Chinese Academy of Sciences | Paper
52 | DSSD | Deconvolutional Single Shot Detector | Jan 2017 | UNC Chapel Hill | Paper
53 | PolyNet | Better Vision with More Complex Paths | Mar 2017 | Microsoft Research | Paper
54 | IGCNet | Interleaved Group Convolutions | Mar 2017 | Microsoft Research | Paper
55 | DCN | Deformable Convolutional Networks | Mar 2017 | Microsoft Research Asia | Paper
56 | IDW-CNN | Image Dependent Warping CNN | Mar 2017 | Seoul National University | Paper
57 | FCIS | Fully Convolutional Instance-aware Semantic Segmentation | Mar 2017 | Microsoft Research Asia | Paper
58 | Residual Attention Network | Attention mechanism for image classification | Apr 2017 | Tsinghua University | Paper
59 | ResNet-DUC-HDC | Dense Upsampling Convolution and Hybrid Dilated Convolution | Apr 2017 | Tsinghua University | Paper
60 | MobileNet | Focused on efficient models for mobile and embedded devices using depthwise separable convolutions. | Apr 2017 | Google |
61 | G-RMI | Google’s large scale object detection system | Jun 2017 | Google Research | Paper
62 | GraphSAGE | Inductive Representation Learning on Large Graphs | Jun 2017 | Stanford University | Paper
63 | DPN | Dual Path Networks combining ResNet and DenseNet | Jul 2017 | UCSD, Momenta | Paper
64 | ERFNet | Efficient Residual Factorized ConvNet for real-time semantic segmentation | Jul 2017 | Universidad de Alcalá | Paper
65 | Suggestive Annotation | Active Learning for medical image segmentation | Jul 2017 | ETH Zurich | Paper
66 | RetinaNet | Focal Loss for Dense Object Detection | Aug 2017 | Facebook AI Research | Paper, Blog
67 | Hide-and-Seek | Weakly-supervised object detection training strategy | Aug 2017 | Carnegie Mellon University | Paper
68 | C3 | Cross-City Cascade for semantic segmentation | Aug 2017 | University of Oxford | Paper
69 | U-net+Res-net | Combined U-net and Residual Network for medical segmentation | Aug 2017 | Technical University of Munich | Paper
70 | DenseVoxNet | Dense Voxel Network for 3D medical image segmentation | Sep 2017 | Chinese University of Hong Kong | Paper
71 | Graph Attention Networks | Self-attention for Graph Data | Oct 2017 | Université de Montréal | Paper
72 | Light-Head R-CNN | Light-weight object detection architecture | Nov 2017 | Megvii Technology | Paper
73 | LayerCascade | Instance segmentation via layer cascade | Nov 2017 | University of Washington | Paper
74 | 3D U-net + ResNet | Combined 3D U-net and ResNet for volumetric segmentation | Nov 2017 | Technical University of Munich | Paper
75 | Cascade R-CNN | Multi-stage object detection refinement | Dec 2017 | CIDSE | Paper
76 | StairNet | Top-down semantic feature refinement | Dec 2017 | Seoul National University | Paper
77 | MaskLab | Instance Segmentation by Refining Object Detection | Jan 2018 | Google | Paper
78 | RU-Net + R2U-Net | Recurrent Residual U-Net variants | Jan 2018 | University of Dhaka | Paper
79 | AmoebaNet | Evolutionary Architecture Search | Feb 2018 | Google Brain | Paper
80 | SqueezeNext | Hardware-Aware Neural Network Design | Feb 2018 | UC Berkeley | Paper
81 | ENAS | Efficient Neural Architecture Search | Feb 2018 | Google Brain | Paper
82 | DeepLab v3+ | Encoder-Decoder with Atrous Separable Convolution | Feb 2018 | Google | Paper, Blog
83 | Group Normalization | Alternative to Batch Normalization | Mar 2018 | Facebook AI Research | Paper
84 | ACoL | Adversarial Complementary Learning for weakly supervised object localization | Mar 2018 | University of Technology Sydney | Paper
85 | BR²Net | Boundary Refinement and Recurrent Network for semantic segmentation | Mar 2018 | Tsinghua University | Paper
86 | PANet | Path Aggregation Network for Instance Segmentation | Mar 2018 | Chinese Academy of Sciences | Paper
87 | MorphNet | Fast & Simple Resource-Constrained Structure Learning | Apr 2018 | Google Research | Paper
88 | ImageNet Rethinking | Research on ImageNet training strategies | Apr 2018 | Facebook AI Research | Paper
89 | Attention U-net | Attention Gates for Medical Image Segmentation | Apr 2018 | University College London | Paper
90 | MegNet | Multi-Evidence Guidance for weakly supervised object detection | Jun 2018 | University of Technology Sydney | Paper
91 | H-DenseUNet | Hybrid Densely Connected UNet for medical segmentation | Jun 2018 | Chinese University of Hong Kong | Paper
92 | PNASNet | Progressive Neural Architecture Search | Jul 2018 | Google Brain | Paper
93 | ShuffleNetV2 | Practical Guidelines for Mobile Network Design | Jul 2018 | Face++ | Paper
94 | BAM | Bottleneck Attention Module | Jul 2018 | KAIST | Paper
95 | CBAM | Convolutional Block Attention Module | Jul 2018 | KAIST | Paper
96 | NetAdapt | Platform-Aware Neural Network Adaptation | Jul 2018 | MIT | Paper
97 | U-Net++ | Nested U-Net Architecture | Jul 2018 | Arizona State University | Paper
98 | DU-Net | Deformable U-Net for medical image segmentation | Aug 2018 | Shanghai Jiao Tong University | Paper
99 | DropBlock | Structured dropout method for convolutional networks | Oct 2018 | Google Brain | Paper
100 | AutoDeepLab | Neural Architecture Search for Semantic Image Segmentation | Jan 2019 | Google Research | Paper
101 | ESPNetv2 | Efficient Spatial Pyramid of Dilated Convolutions | Mar 2019 | MIT | Paper
102 | SiamRPN++ | Deep learning-based visual tracking framework that removes spatial awareness by sampling features across different layers | Mar 2019 | Chinese Academy of Sciences | Paper
103 | Libra R-CNN | Balanced learning framework for object detection that addresses sample level, feature level, and objective level imbalance | Apr 2019 | SenseTime Research | Paper
104 | FBNet | Hardware-Aware Efficient ConvNet Design | May 2019 | Facebook AI Research | Paper
105 | SDN | Selective Deep Network for efficient visual recognition | May 2019 | University of Texas | Paper
106 | MultiResUNet | Multi-Resolution U-Net for medical image segmentation | May 2019 | Bangladesh University | Paper
107 | EfficientNet | Scaled networks uniformly in depth, width, and resolution for better efficiency. | May 2019 | Google |
108 | ADL | Attention-based Dropout Layer for weakly supervised object localization | Jun 2019 | KAIST | Paper
109 | ARMA Convolution | Auto-Regressive Moving Average Graph Filtering | Jun 2019 | Università degli Studi di Modena | Paper
110 | Panoptic Segmentation | Unified Scene Parsing Framework | Jun 2019 | Facebook AI Research | Paper
111 | CutMix | Data augmentation method combining cut and mix images | Aug 2019 | Clova AI Research, NAVER | Paper
112 | SlowFast | Two-pathway network for video recognition that captures both slow and fast motion patterns | Aug 2019 | Facebook AI Research | Paper
113 | EfficientDet | Scalable object detection architecture using weighted bidirectional feature network and compound scaling | Nov 2019 | Google Research | Paper
114 | AdderNet | Neural Networks with Only Addition Operations | Dec 2019 | Huawei Noah’s Ark Lab | Paper
115 | TPN | Temporal Pyramid Network for action detection in videos | Dec 2019 | Microsoft Research Asia | Paper
116 | ATSS | Adaptive Training Sample Selection for object detection | Dec 2019 | ByteDance AI Lab | Paper
117 | ACNe | Attentive Context Normalization for robust permutation-equivariant learning | Dec 2019 | KAIST | Paper
118 | Cascade Cost Volume | Cascade Cost Volume for stereo matching | Dec 2019 | Megvii Technology | Paper
119 | Yolact++ | Real-time instance segmentation with improved mask quality and inference speed | Jan 2020 | University of California, Davis | Paper
120 | MCN | Multi-task Collaboration Network | Jan 2020 | Microsoft Research Asia | Paper
121 | RandLA-Net | Large-scale Point Cloud Semantic Segmentation | Jan 2020 | University of Oxford | Paper
122 | OccuSeg | 3D instance segmentation approach that handles occlusions in point clouds | Mar 2020 | Stanford University | Paper
123 | GTAD | Global Temporal Action Detection framework for temporal action localization | Mar 2020 | Sun Yat-sen University | Paper
124 | Attention-RPN | Visual tracking framework with attention mechanism in Region Proposal Network | Mar 2020 | Chinese Academy of Sciences | Paper
125 | QSA + QNT | Quantized Squeeze-and-Attention Networks | Mar 2020 | Tsinghua University | Paper
126 | UNet 3+ | Full-Scale Connected UNet for medical image segmentation | Mar 2020 | Southern Medical University | Paper
127 | ROAM | Recurrently Optimizing Tracking Model | Mar 2020 | ByteDance AI Lab | Paper
128 | PF-NET | Point Fractal Network for 3D point cloud completion | Mar 2020 | Simon Fraser University | Paper
129 | Total3DUnderstanding | 3D Scene Understanding | Mar 2020 | National University of Singapore | Paper
130 | SG-NN | Scene Graph Neural Networks | Mar 2020 | Georgia Tech | Paper
131 | SEAN | Semantic Region-Adaptive Normalization | Mar 2020 | ETH Zürich | Paper
132 | SAOL | Self-Attention Object Localization | Apr 2020 | Seoul National University | Paper
133 | VGGNet For Covid19 | Modified VGG architecture for COVID-19 detection | Apr 2020 | Multiple Institutions | Paper
134 | CentripetalNet | Anchor-free object detection with point-based prediction | Apr 2020 | Megvii Technology | Paper
135 | PointAugment | Auto-Augmentation for 3D Point Cloud | Apr 2020 | National University of Singapore | Paper
136 | PQ-Net | Learning to Generate 3D Shapes | Apr 2020 | Stanford University | Paper
137 | Axial-DeepLab | Stand-Alone Axial-Attention for Vision Models | Apr 2020 | Johns Hopkins University | Paper
138 | SipMask | Spatial Information Preservation for Fast Instance Segmentation | Apr 2020 | Inception Institute of AI | Paper
139 | SCAN | Learning to Classify Images without Labels | Apr 2020 | Facebook AI Research | Paper
140 | MutualNet | Adaptive ConvNet via Mutual Learning | Apr 2020 | Microsoft Research Asia | Paper
141 | DETR | End-to-End Object Detection with Transformers | May 2020 | Facebook AI Research | Paper
142 | C-Flow | Conditional Normalizing Flows | May 2020 | ETH Zürich | Paper
143 | PerfectShape | Shape completion using implicit functions | May 2020 | Stanford University | Paper
144 | UFO² | Unified Framework for Object Detection | May 2020 | Carnegie Mellon University | Paper
145 | Refinement Network | RGB-D Scene Understanding | May 2020 | Technical University Munich | Paper
146 | AssembleNet++ | Video Recognition with Learnable Connectivity | May 2020 | Google Research | Paper
147 | WeightNet | Revisiting Weight Networks | May 2020 | Microsoft Research | Paper
148 | YOLOv5 | Improved version of YOLO with better speed-accuracy trade-off | Jun 2020 | Ultralytics | Paper
149 | UCTGAN | Unsupervised Cartoon-to-Real Translation GAN for image translation between cartoon and real-world domains | Jun 2020 | Nanyang Technological University | Paper
150 | IF-Nets | Implicit Function Neural Networks for 3D reconstruction | Jun 2020 | Max Planck Institute | Paper
151 | SketchGCN | Sketch Recognition using Graph Convolutional Networks | Jun 2020 | University of British Columbia | Paper
152 | AABO | Adaptive Anchor Box Optimization | Jun 2020 | Huawei Noah’s Ark Lab | Paper
153 | Polka Lines | Line Detection using Polar Coordinates | Jun 2020 | Korea University | Paper
154 | Pose2Mesh | 3D Human Pose and Mesh Recovery | Jun 2020 | Korea University | Paper
155 | SNE-RoadSeg | Road Segmentation with Synthetic Data | Jun 2020 | Hong Kong University | Paper
156 | Deep Hough Transform | Line Detection using Deep Learning | Jun 2020 | Chinese Academy of Sciences | Paper
157 | Non-Local Sparse Attention | Efficient Attention Mechanism | Jun 2020 | Google Research | Paper
158 | Hit-Detector | Hierarchical Trinity architecture for object detection combining different detection paradigms | Jul 2020 | ByteDance AI Lab | Paper
159 | Spectral 3D Computer Vision | Graph Neural Network Library | Jul 2020 | Multiple Contributors | Paper
160 | TIDE | Error Analysis Tool for Object Detection | Jul 2020 | Carnegie Mellon University | Paper
161 | SimAug | Learning Robust Representations through Simulation | Jul 2020 | Carnegie Mellon University | Paper
162 | HOTR | End-to-End Human-Object Interaction Detection | Jul 2020 | KAIST | Paper
163 | ReXNet | Rethinking Channel Dimensions for Efficient Model Design | Jul 2020 | UC Berkeley | Paper
164 | Keep Eyes on the Lane | Lane Detection with Deep Learning | Jul 2020 | Shanghai Jiao Tong University | Paper
165 | AdvPC | Adversarial Point Cloud Defense | Jul 2020 | Tsinghua University | Paper
166 | PD-GAN | Probabilistic Diverse GAN | Jul 2020 | University of Oxford | Paper
167 | FedDG | Federated Domain Generalization | Jul 2020 | Carnegie Mellon University | Paper
168 | Dynamic RCNN | Dynamic R-CNN for object detection with improved training and inference | Aug 2020 | ByteDance AI Lab | Paper
169 | Aug-FPN | Augmented Feature Pyramid Network for object detection with improved multi-scale feature fusion | Aug 2020 | Tsinghua University | Paper
170 | Instant-teaching | Self-training for Object Detection | Aug 2020 | ByteDance AI Lab | Paper
171 | Soft-IntroVAE | Soft Introduction of Variational AutoEncoders | Aug 2020 | Tel Aviv University | Paper
172 | DiNTS | Differentiable Neural Network Transform Search | Aug 2020 | Microsoft Research | Paper
173 | Eagle Eye | Fast Sub-net Evaluation for Efficient Neural Network Training | Aug 2020 | MIT | Paper
174 | StyleMapGAN | Exploiting Spatial Dimensions of Latent for Image Manipulation | Aug 2020 | KAIST | Paper
175 | TediGAN | Text-Guided Diverse Image Generation | Aug 2020 | Microsoft Research Asia | Paper
176 | Auto-Exposure Fusion | Automatic Exposure Fusion for Photography | Aug 2020 | ETH Zürich | Paper
177 | Vision Transformer | Transformer architecture adapted for image recognition tasks | Sep 2020 | Google Research | Paper
178 | IDU | Instance Depth Embedding for RGB-D salient object detection | Sep 2020 | Nankai University | Paper
179 | VideoMoCo | Contrastive Learning for Video Understanding | Sep 2020 | Microsoft Research Asia | Paper
180 | MZSR | Meta-Transfer Learning for Zero-Shot Super-Resolution | Nov 2020 | KAIST | Paper
181 | DeiT | Data-efficient training of image transformers | Dec 2020 | Facebook AI Research | Paper
182 | Involution | Inverting Convolution for Visual Recognition | Dec 2020 | Shanghai AI Lab | Paper
183 | Deep Learning on Semantic Segmentation | Comprehensive Survey and Benchmark | Dec 2020 | Chinese Academy of Sciences | Paper
184 | LiteFlowNet3 | Lightweight Optical Flow Estimation | Dec 2020 | Chinese University of Hong Kong | Paper
185 | PPDM | Parallel Point Detection and Matching | Dec 2020 | ByteDance AI Lab | Paper
186 | RepVGG | Making VGG-style ConvNets Great Again | Jan 2021 | MEGVII Technology | Paper
187 | PSConvolution | Parameter-Sharing Convolution for Deep Learning | Jan 2021 | Tsinghua University | Paper
188 | PerPixel Classification | Pixel-wise Classification Network | Jan 2021 | ETH Zürich | Paper
189 | PIPAL | Perceptual Image Quality Assessment | Jan 2021 | Nanyang Technological University | Paper
190 | ArtGAN | Artwork Synthesis with GAN | Feb 2021 | NVIDIA Research | Paper
191 | Synthetic to Real | Domain Adaptation for Semantic Segmentation | Feb 2021 | ETH Zürich | Paper
192 | Spatial-Phase-Shallow-Learning | Phase-Based Feature Learning | Feb 2021 | Peking University | Paper
193 | DARKGAN | Dark Image Enhancement with GAN | Feb 2021 | Tsinghua University | Paper
194 | Deep Imbalance Regression | Learning from Imbalanced Data | Feb 2021 | Carnegie Mellon University | Paper
195 | Room Classification GNN | Graph Neural Network for Room Layout | Feb 2021 | Facebook Research | Paper
196 | Pyramid Vision Transformer | Hierarchical Vision Transformer | Feb 2021 | KAIST | Paper
197 | Residual Attention | Attention Mechanism for CNNs | Feb 2021 | Google Research | Paper
198 | Teachers do more than teach | Multi-teacher approach for image-to-image translation | Mar 2021 | Tel Aviv University | Paper
199 | Vip-DeepLab | Visual Parsing DeepLab for Panoptic Segmentation | Mar 2021 | Google Research | Paper
200 | HistoGAN | Histological Image Generation with GAN | Mar 2021 | University of Oxford | Paper
201 | Anchor-Free Person Search | End-to-End Person Search without Anchors | Mar 2021 | Chinese Academy of Sciences | Paper
202 | CBNetV2 | Composite Backbone Network | Mar 2021 | Megvii Technology | Paper
203 | Kaleido-BERT | Vision-Language Pre-training | Mar 2021 | Microsoft Research Asia | Paper
204 | Elastic Graph Neural Network | Adaptive Graph Structure Learning | Mar 2021 | Stanford University | Paper
205 | Rank and Sort Loss | Loss Function for Object Detection | Mar 2021 | ByteDance AI Lab | Paper
206 | EigenGAN | Eigenvalue-Based GAN Architecture | Mar 2021 | MIT | Paper
207 | DetCo | Unsupervised Detection Pre-training | Mar 2021 | Microsoft Research Asia | Paper
208 | MG-GAN | Multi-Generator GAN | Mar 2021 | NVIDIA Research | Paper
209 | AdaAttN | Adaptive Attention for Style Transfer | Mar 2021 | Microsoft Research Asia | Paper
210 | AirBERT | Vision-Language Model for Aerial Images | Mar 2021 | Chinese Academy of Sciences | Paper
211 | DeepGCNs | Deep Graph Convolutional Networks | Mar 2021 | KAUST | Paper
212 | Survey: Instance Segmentation | Comprehensive review of instance segmentation methods | Mar 2021 | Multiple Institutions | Paper
213 | LoFTR | Local Feature TRansformer for establishing dense correspondences between images | Apr 2021 | Zhejiang University | Paper
214 | Semantic Image Matting | Matting with Semantic Guidance | Apr 2021 | ByteDance AI Lab | Paper
215 | EfficientNetV2 | Improved EfficientNet Architecture | Apr 2021 | Google Research | Paper
216 | Closed-Loop Matters | Dual Regression for Image Generation | Apr 2021 | University of Oxford | Paper
217 | Mobile-Former | Mobile-Friendly Transformer | Apr 2021 | Microsoft Research | Paper
218 | GNeRF | Generalizable Neural Radiance Fields | Apr 2021 | UC Berkeley | Paper
219 | DETR with Modulated Co-Attention | Enhanced DETR Architecture | Apr 2021 | Facebook AI Research | Paper
220 | Adaptable GAN Encoders | Flexible GAN Inversion | Apr 2021 | Adobe Research | Paper
221 | Conformer | Local Features Meet Global Dependencies | Apr 2021 | Shanghai AI Lab | Paper
222 | VMNet | Visual Manipulation Networks | Apr 2021 | Stanford University | Paper
223 | Battle of Network Structure | Network Architecture Comparison Study | Apr 2021 | Google Research | Paper
224 | Efficient Person Search | Fast Person Search Framework | Apr 2021 | University of Technology Sydney | Paper
225 | SLIDE | Smart Learning on Large-Scale Data | Apr 2021 | Carnegie Mellon University | Paper
226 | SOTR | Transformer for Set Operations | Apr 2021 | Tsinghua University | Paper
227 | CANet | Class-Agnostic Segmentation Networks | Apr 2021 | UC Berkeley | Paper
228 | YOLOP | Real-time Driving Perception | May 2021 | Huawei Noah’s Ark Lab | Paper
229 | InSeGAN | Interactive Segmentation with GAN | May 2021 | Adobe Research | Paper
230 | GroupFormer | Group-Based Attention | May 2021 | Microsoft Research | Paper
231 | Super Neuron | Neural Architecture Enhancement | May 2021 | MIT | Paper
232 | SO-Pose | Self-Occlusion Aware Pose Estimation | May 2021 | NVIDIA Research | Paper
233 | TxT | Text-driven Text Generation | May 2021 | Google Research | Paper
234 | OS2D | One-Stage 2D Object Detection | May 2021 | Yandex Research | Paper
235 | CodeNet | Large-Scale Code Dataset | May 2021 | IBM Research | Paper
236 | Geometric Deep Learning | Blueprint for designing architectures for geometric data | May 2021 | Imperial College London | Paper
237 | Oriented R-CNN | Oriented Object Detection | Jun 2021 | Tongji University | Paper
238 | XVFI | Video Frame Interpolation | Jun 2021 | KAIST | Paper
239 | Cross Domain Contrastive Learning | Domain Adaptation via Contrastive Learning | Jun 2021 | Microsoft Research | Paper
240 | PointManifoldCut | Data Augmentation for Point Clouds | Jun 2021 | Stanford University | Paper
241 | Distance IOU Loss | Improved Loss Function for Object Detection | Jun 2021 | Tsinghua University | Paper
242 | ConvMLP | Convolutional MLP Architecture | Jul 2021 | University of Oregon | Paper
243 | Graph-FPN | Feature Pyramid Networks with Graph Neural Networks | Jul 2021 | Carnegie Mellon University | Paper
244 | WatchOut! | Motion Blur Impact on DNNs | Jul 2021 | ETH Zürich | Paper
245 | ECA-Net | Efficient Channel Attention Network | Jul 2021 | Tsinghua University | Paper
246 | ShiftAddNet | Efficient Neural Network Training | Aug 2021 | MIT | Paper
247 | Deep Imitation Learning | Survey of Imitation Learning Methods | Aug 2021 | DeepMind | Paper
248 | 3DETR | 3D Object Detection with Transformers | Aug 2021 | Facebook AI Research | Paper
249 | ByteTrack | Multi-Object Tracking Framework | Aug 2021 | ByteDance AI Lab | Paper
250 | Neuron Merging | Network Compression via Neuron Merging | Sep 2021 | Microsoft Research | Paper
251 | Focal Transformer | Vision Transformer with Focal Attention | Sep 2021 | Microsoft Research | Paper
252 | Non-Deep Networks | Alternative to Deep Neural Networks | Sep 2021 | MIT | Paper
253 | PytorchVideo | Deep Learning Library for Video Understanding | Sep 2021 | Facebook AI Research | Paper
254 | HeadGAN | Head Generation and Editing | Oct 2021 | Tel Aviv University | Paper
255 | StyleGAN3 | Alias-Free Generative Network | Oct 2021 | NVIDIA Research | Paper
256 | MedMNIST | Medical Image Dataset Collection | Oct 2021 | Stanford University | Paper
257 | TokenLearner | Dynamic Token Selection in Vision Transformers | Oct 2021 | Google Research | Paper
258 | Temporal Fusion Transformer | Multi-horizon Forecasting | Oct 2021 | Google Research | Paper
259 | NeuralProphet | Neural Network based Time-Series Model | Oct 2021 | Stanford University | Paper
260 | MetNet-2 | Weather Forecasting Model | Oct 2021 | Google Research | Paper
261 | Plan-then-generate | Controlled Text Generation | Nov 2021 | Microsoft Research | Paper
262 | ProjectedGAN | Improved GAN Image Quality | Nov 2021 | NVIDIA Research | Paper
263 | PHALP | Pose and Human Analysis using Language Processing | Nov 2021 | Carnegie Mellon University | Paper
264 | Semantic Diffusion Guidance | Controlled Image Generation | Nov 2021 | Stanford University | Paper
265 | GauGAN | Text-to-Image Generation | Nov 2021 | NVIDIA Research | Paper
266 | NeatNet | Neural Architecture Evolution | Nov 2021 | Google Research | Paper
267 | DenseULearn | Dense Prediction with Uncertainty | Nov 2021 | ETH Zürich | Paper
268 | StyleNeRF | Neural Radiance Fields with Style-based Generation | Dec 2021 | NVIDIA Research | Paper
269 | Colossal-AI | Large-Scale Parallel Training System | Dec 2021 | UC Berkeley | Paper
270 | EditGAN | Semantic Image Editing with GANs | Dec 2021 | Adobe Research | Paper
271 | PoolFormer | Alternative to Attention-based Transformers | Dec 2021 | Sea AI Lab | Paper
272 | GLIP | Grounded Language-Image Pre-training | Dec 2021 | Microsoft Research | Paper
273 | PixMix | Data Augmentation Strategy | Dec 2021 | Google Research | Paper
274 | GANgealing | GAN-based Image Alignment | Dec 2021 | MIT | Paper
275 | HiClass | Hierarchical Classification Metrics | Dec 2021 | Microsoft Research | Paper
276 | MetaFormer | General Architecture for Vision | Dec 2021 | Sea AI Lab | Paper
277 | SAVi | Slot Attention for Video Understanding | Dec 2021 | DeepMind | Paper
278 | PARP | Parameter Reduction Technique | Dec 2021 | MIT | Paper
279 | TransMix | Data Augmentation for Transformers | Dec 2021 | Microsoft Research | Paper
280 | Stable Long Term Video SR | Long-term Video Super Resolution | Dec 2021 | ETH Zürich | Paper
281 | Few-Shot Learner | Few-Shot Learning Framework | Dec 2021 | Meta AI Research | Paper
282 | StyleSwin | StyleGAN with Swin Transformer | Dec 2021 | Microsoft Research | Paper
283 | 2 Stage U-net | Two-Stage Medical Image Segmentation | Dec 2021 | Stanford University | Paper
284 | ELSA | Efficient Long-term Semantic Aggregation | Dec 2021 | ETH Zürich | Paper
285 | GLIDE | Text-Guided Image Generation | Dec 2021 | OpenAI | Paper
286 | AdaViT | Adaptive Vision Transformers | Jan 2022 | Microsoft Research | Paper
287 | Exemplar Transformers | Example-based Vision Transformers | Jan 2022 | Google Research | Paper
288 | RepMLNet | Reprogrammable Multi-Layer Network | Jan 2022 | Tsinghua University | Paper
289 | Untrained Deep NN | Deep Networks without Training | Jan 2022 | MIT | Paper
290 | JoJoGAN | Just one Joint Training GAN | Jan 2022 | National University of Singapore | Paper
291 | PRIME | Pre-trained Image Encoders | Jan 2022 | Google Research | Paper
292 | StyleGAN-V | Video Generation with StyleGAN | Jan 2022 | NVIDIA Research | Paper
293 | SmoothNet | Motion Smoothing Network | Jan 2022 | ETH Zürich | Paper
294 | PCACE | Point Cloud Auto-Encoder | Jan 2022 | Tsinghua University | Paper
295 | Siamese CD | Change Detection with Transformers | Jan 2022 | Wuhan University | Paper
296 | SASA | Self-Attention Spatial Adaptivity | Jan 2022 | Carnegie Mellon University | Paper
297 | GCD | Generalized Category Discovery | Jan 2022 | University of Oxford | Paper
298 | 3D ConvNet Optimization | Optimization Planning for 3D CNNs | Jan 2022 | Google Research | Paper
299 | SeamlessGAN | Seamless Image Generation | Jan 2022 | Adobe Research | Paper
300 | HardBoost | Hard Example Mining with Boosting | Jan 2022 | Tsinghua University | Paper
301 | Q-ViT | Quantized Vision Transformer | Jan 2022 | Meta AI Research | Paper
302 | GeoFill | Geometry-aware Image Inpainting | Jan 2022 | Adobe Research | Paper
303 | Detic | Detector with Image Classes | Jan 2022 | UC Berkeley | Paper
304 | RelTR | Relational Transformer | Jan 2022 | Microsoft Research | Paper
305 | ResiDualGAN | Residual Dual GAN Architecture | Jan 2022 | NVIDIA Research | Paper
306 | You Only Cut Once | Single-Shot Instance Segmentation | Jan 2022 | ByteDance AI Lab | Paper
307 | KFIoU Loss | Kalman Filter IoU Loss Function | Jan 2022 | Tongji University | Paper
308 | StyleGAN3 Editing | Image and Video Editing Framework | Jan 2022 | NVIDIA Research | Paper
309 | Block-NeRF | City-scale Neural Radiance Fields using block-based decomposition | Jan 2022 | Waymo/Google Research | Paper
310 | SeMask | Semantically Masked Transformers | Feb 2022 | NVIDIA Research | Paper
311 | SLIP | Self-supervision with Language-Image Pre-training | Feb 2022 | UC Berkeley | Paper
312 | Deformable ViT | Vision Transformer with Deformable Attention | Feb 2022 | Microsoft Research | Paper
313 | Lawin Transformer | Lightweight Transformer for Segmentation | Feb 2022 | Nanjing University | Paper
314 | HyperionSolarNet | Solar Panel Detection Network | Feb 2022 | Stanford University | Paper
315 | KerGNNs | Kernel Graph Neural Networks | Feb 2022 | MIT | Paper
316 | gDNA | Geometric DNA Networks | Feb 2022 | DeepMind | Paper
317 | HYDRA | Hybrid Deep Learning Architecture | Feb 2022 | Microsoft Research | Paper
318 | DDU-Net | Dense Dual-Path U-Net | Feb 2022 | Shanghai Jiao Tong University | Paper
319 | SPAMs | Spatial Attention Modules | Feb 2022 | Google Research | Paper
320 | ReLICv2 | Representation Learning with Image Consistency | Feb 2022 | Meta AI Research | Paper
321 | Momentum Capsules | Dynamic Routing with Momentum | Feb 2022 | Google Research | Paper
322 | SAR Despeckling | Transformer for SAR Image Denoising | Feb 2022 | Chinese Academy of Sciences | Paper
323 | VRT | Video Restoration Transformer | Feb 2022 | ETH Zürich | Paper
324 | StyleGAN-XL | Extra Large Scale StyleGAN | Feb 2022 | NVIDIA Research | Paper
325 | AlphaCode | Code Generation AI System | Feb 2022 | DeepMind | Paper
326 | StyleGAN-Human | Human image synthesis using StyleGAN | Apr 2022 | Microsoft Research | Paper
327 | How Do Vision Transformers Work? | Analysis of internal mechanisms of Vision Transformers | Jun 2022 | Google Research | Paper
328 | FERV39k | Facial Expression Recognition Dataset with 39k samples | Jun 2022 | South China University of Technology | Paper
329 | DaViT | Data-efficient Vision Transformer | Jul 2022 | Microsoft Research | Paper
330 | BEVFormer | Bird’s Eye View Transformer for autonomous driving | Aug 2022 | Shanghai AI Lab | Paper
331 | TensoRF | Tensorial Radiance Fields for efficient 3D reconstruction | Sep 2022 | Zhejiang University | Paper
332 | WebFace260M | Large-scale face recognition dataset | Sep 2022 | InsightFace | Paper
333 | Neighborhood Attention Transformer | Local attention mechanism for vision tasks | Oct 2022 | Meta AI Research | Paper
334 | Barbershop | Hair editing and synthesis framework | Oct 2022 | Adobe Research | Paper
335 | Visual Attention Network | Novel attention mechanism for computer vision | Nov 2022 | Meta AI Research | Paper
336 | MaskGIT | Masked Generative Image Transformer | Nov 2022 | Google Research | Paper
337 | CenterNet++ | Improved CenterNet for object detection | Nov 2022 | University of Texas | Paper
338 | Patch-NetVLAD+ | Enhanced visual place recognition using patch-based features | Dec 2022 | Oxford University | Paper
339 | PENCIL | Probabilistic end-to-end noise correction | Dec 2022 | NTU Singapore | Paper
340 | CenterSnap | Center-based 3D object pose estimation | Dec 2022 | Intel Labs | Paper
341 | AGCN | Adaptive Graph Convolutional Network | Dec 2022 | Tsinghua University | Paper
342 | AutoAvatar | Automated avatar generation from images | Dec 2022 | Tencent AI Lab | Paper
343 | Balanced MSE | Balanced Mean Squared Error for imbalanced data | Dec 2022 | Carnegie Mellon University | Paper
344 | ReCLIP | Improved CLIP with region-based features | Dec 2022 | Google Research | Paper
345 | EditGAN | GAN-based image editing framework | Dec 2022 | NVIDIA Research | Paper
346 | HuMMan | Human Motion and Manipulation dataset | Dec 2022 | Max Planck Institute | Paper
347 | BlobGAN | Unsupervised part-aware image generation | Dec 2022 | MIT | Paper
348 | Deep Spectral Methods | Spectral analysis for deep learning | Dec 2022 | MIT | Paper
349 | TransformNet | Transformer-based architecture for geometry transformation | Jan 2023 | Carnegie Mellon University | Paper
350 | Mirror-YOLO | YOLO variant using mirror augmentation for detection | Jan 2023 | Peking University | Paper
351 | Paying U-Attention to Textures | U-Net based texture synthesis with attention | Jan 2023 | Adobe Research | Paper
352 | ZippyPoint | Fast point cloud processing architecture | Jan 2023 | ETH Zürich | Paper
353 | InsetGAN for Full-Body Image Generation | GAN-based full-body image synthesis | Jan 2023 | Max Planck Institute | Paper
354 | Mixed Differential Privacy | Privacy-preserving vision model training | Jan 2023 | MIT | Paper
355 | L³U-Net | Lightweight U-Net variant with enhanced learning | Jan 2023 | ETH Zürich | Paper
356 | RBGNet | Residual Bidirectional Graph Network | Jan 2023 | Peking University | Paper
357 | TopFormer | Top-down Transformer for vision tasks | Jan 2023 | Microsoft Research | Paper
358 | CLIP-GEN | CLIP-guided image generation | Jan 2023 | OpenAI | Paper
359 | DANBO | Dynamic Attention Network for Body Pose | Jan 2023 | Carnegie Mellon University | Paper
360 | KeypointNeRF | NeRF with keypoint conditioning | Jan 2023 | Stanford University | Paper
361 | VOS (Visual Object Streaming) | Efficient streaming framework for video object segmentation | Feb 2023 | ETH Zürich | Paper
362 | ScoreNet | Score-based generative modeling for point cloud generation | Feb 2023 | UC Berkeley | Paper
363 | GroupViT | Vision Transformer with dynamic grouping mechanism | Feb 2023 | NVIDIA Research | Paper
364 | TCTrack | Temporal context-aware tracking framework | Feb 2023 | Chinese Academy of Sciences | Paper
365 | MLSeg | Multi-level semantic segmentation framework | Feb 2023 | Stanford University | Paper
366 | StyleBabel | Text-guided style transfer using BABEL embeddings | Feb 2023 | NVIDIA Research | Paper
367 | Mixed DualStyleGAN | Dual-domain style transfer with mixed training | Feb 2023 | NVIDIA | Paper
368 | StyleT2I | Style-based text-to-image generation | Feb 2023 | Microsoft Research | Paper
369 | SPAct | Spatial-temporal action recognition | Feb 2023 | University of Oxford | Paper
370 | JIFF | Joint Image and Feature Fusion | Feb 2023 | Stanford University | Paper
371 | C3-STISR | Cross-Camera Stereo Image Super-Resolution | Feb 2023 | Tsinghua University | Paper
372 | IVY | Integrated Vision System | Feb 2023 | Intel Research | Paper
373 | StyLandGAN | Stylized landscape generation | Feb 2023 | NVIDIA Research | Paper
374 | NeuralFusion | Neural fusion for 3D reconstruction using implicit representations | Mar 2023 | MIT | Paper
375COLAContrastive learning approach for visual recognitionMar 2023Stanford UniversityPaper
376VLP (Vision-Language Pre-training)Joint pre-training for vision and language tasksMar 2023Microsoft ResearchPaper
377Level-K to Nash EquilibriumGame theoretic approach to vision problemsMar 2023DeepMindPaper
378HyperTransformerHypernetwork-based transformer for vision tasksMar 2023Google ResearchPaper
379GrainSpaceGranular spatial representation learningMar 2023Carnegie Mellon UniversityPaper
380ROOD-MRIRobust out-of-distribution detection for medical imagingMar 2023MITPaper
381BambooFramework for efficient neural architecture searchMar 2023Microsoft ResearchPaper
382BigDetectionLarge-scale object detection frameworkMar 2023Facebook AI ResearchPaper
383TransEditorTransformer-based image editing frameworkMar 2023Adobe ResearchPaper
384Event TransformerTransformer architecture for event-based visionMar 2023Intel LabsPaper
385MVSTERMulti-view Stereo TransformerMar 2023ETH ZürichPaper
386CLIP-ArtCLIP-based artistic image synthesisMar 2023DeepMindPaper
387SequencerSequential modeling for vision tasksMar 2023Google ResearchPaper
388GraphWorldBenchmark for graph neural networksMar 2023DeepMindPaper
389F8NetLightweight network for efficient feature extractionApr 2023Tsinghua UniversityPaper
390LatentFormerTransformer architecture for latent space manipulationApr 2023MITPaper
Dr. Hari Thapliyaal

Dr. Hari Thapliyal is a seasoned professional and prolific blogger with a multifaceted background that spans the realms of Data Science, Project Management, and Advait-Vedanta Philosophy. Holding a Doctorate in AI/NLP from SSBM (Geneva, Switzerland), Hari has earned Master's degrees in Computers, Business Management, Data Science, and Economics, reflecting his dedication to continuous learning and a diverse skill set. With over three decades of experience in management and leadership, Hari has proven expertise in training, consulting, and coaching within the technology sector. His extensive 16+ years in all phases of software product development are complemented by a decade-long focus on course design, training, coaching, and consulting in Project Management. In the dynamic field of Data Science, Hari stands out with more than three years of hands-on experience in software development, training course development, training, and mentoring professionals. His areas of specialization include Data Science, AI, Computer Vision, NLP, complex machine learning algorithms, statistical modeling, pattern identification, and extraction of valuable insights. Hari's professional journey showcases his diverse experience in planning and executing multiple types of projects. He excels in driving stakeholders to identify and resolve business problems, consistently delivering excellent results. Beyond the professional sphere, Hari finds solace in long meditation, often seeking secluded places or immersing himself in the embrace of nature.
