National Dong Hwa University Library

Pattern recognition and computer vision : 7th Chinese Conference, PRCV 2024, Urumqi, China, October 18-20, 2024 : proceedings. Part VII

Record Type:	Electronic resources : Monograph/item
Title/Author:	Pattern recognition and computer vision/ edited by Zhouchen Lin ... [et al.].
Remainder of title:	7th Chinese Conference, PRCV 2024, Urumqi, China, October 18-20, 2024 : proceedings.
Variant title:	PRCV 2024
Other author:	Lin, Zhouchen.
Corporate name:	PRCV (Conference)
Published:	Singapore : Springer Nature Singapore : Imprint: Springer, 2025.
Description:	xiv, 587 p. : ill. (chiefly color), digital ; 24 cm.
Contents:	Scene Text Recognition via k-NN Attention-based Decoder and Margin-based Softmax Loss -- Real-Time Text Detection with Multi-Level Feature Fusion and Pixel Clustering -- Refined and Locality-Enhanced Feature for Handwritten Mathematical Expression Recognition -- Learning Fine-grained and Semantically Aware Mamba Representations for Tampered Text Detection in Images -- Dual Feature Enhanced Scene Text Recognition Method for Low-Resource Uyghur -- Segmentation-free Todo Mongolian OCR and Its Public Dataset -- Hybrid Encoding Method for Scene Text Recognition in Low-Resource Uyghur -- ROBC: a Radical-Level Oracle Bone Character Dataset -- Integrated Recognition of Arbitrary-Oriented Multi-Line Billet Number -- Improving Scene Text Recognition with Counting Aware Contrastive Learning and Attention Alignment -- GridMask: An Efficient Scheme for Real Time Curved Scene Text Detection -- Tibetan Handwriting Recognition Method based on Structural Re-parameterization ViT and Vertical Attention -- MFH: Marrying Frequency Domain with Handwritten Mathematical Expression Recognition -- Leveraging Structure Knowledge and Deep Models for the Detection of Abnormal Handwritten Text -- OCR-aware Scene Graph Generation via Multi-modal Object Representation Enhancement and Logical Bias Learning -- Enhancing Transformer-based Table Structure Recognition for Long Tables -- Show Exemplars and Tell Me What You See: In-context Learning with Frozen Large Language Models for TextVQA -- MLR-NET: an arbitrary skew angle detection algorithm for complex layout document images -- TextViTCNN: Enhancing Natural Scene Text Recognition with Hybrid Transformer and Convolutional Networks -- Enhancing Visual Information Extraction with Large Language Models through Layout-aware Instruction Tuning -- SFENet: Arbitrary Shapes Scene Text Detection with Semantic Feature Extractor -- Improving Zero-Shot Image Captioning Efficiency with Metropolis-Hastings Sampling -- Improving Text Classification Performance through Multimodal Representation -- A Multi-feature Fusion Approach for Words Recognition of Ancient Mongolian Documents -- TableRocket: An Efficient and Effective Framework for Table Reconstruction -- Not All Texts Are the Same: Dynamically Querying Texts for Scene Text Detection -- Multi-Modal Attention based on 2D Structured Sequence for Table Recognition -- A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition -- Skeleton-Language Pre-training to Collaborate with Self-Supervised Human Action Recognition -- Spatio-Temporal Contrastive Learning for Compositional Action Recognition -- Path-Guided Motion Prediction with Multi-View Scene Perception -- Privacy-preserving Action Recognition: A Survey -- Attention-based Spatio-temporal modeling with 3D Convolutional Neural Networks for Dynamic Gesture Recognition -- MIT: Multi-cue Injected Transformer for Two-stage HOI Detection -- DIDA: Dynamic Individual-to-integrated Augmentation for Self-Supervised Skeleton-Based Action Recognition -- Multi-scale Spatial and Temporal Feature Aggregation Graph Convolutional Network for Skeleton-Based Action Recognition -- Improving Video Representation of Vision-Language Model with Decoupled Explicit Temporal Modeling -- KS-FuseNet: An efficient action recognition method based on keyframe selection and feature fusion -- Dynamic Skeleton Association Transformer for dyadic Interaction Action Recognition -- Species-Aware Guidance for Animal Action Recognition with Vision-Language Knowledge.
Contained By:	Springer Nature eBook
Subject:	Computer vision--Congresses.
Online resource:	https://doi.org/10.1007/978-981-97-8511-7
ISBN:	9789819785117

Pattern recognition and computer vision [electronic resource] : 7th Chinese Conference, PRCV 2024, Urumqi, China, October 18-20, 2024 : proceedings. Part VII / edited by Zhouchen Lin ... [et al.]. - Singapore : Springer Nature Singapore : Imprint: Springer, 2025. - xiv, 587 p. : ill. (chiefly color), digital ; 24 cm. - (Lecture notes in computer science, 1611-3349 ; 15037).


This 15-volume set LNCS 15031-15045 constitutes the refereed proceedings of the 7th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2024, held in Urumqi, China, during October 18-20, 2024. The 579 full papers presented were carefully reviewed and selected from 1526 submissions. The papers cover various topics in the broad areas of pattern recognition and computer vision, including machine learning, pattern classification and cluster analysis, neural network and deep learning, low-level vision and image processing, object detection and recognition, 3D vision and reconstruction, action recognition, video analysis and understanding, document analysis and recognition, biometrics, medical image analysis, and various applications.

ISBN: 9789819785117

Standard No.: 10.1007/978-981-97-8511-7 (doi)

Subjects--Topical Terms: Computer vision--Congresses.

LC Class. No.: TK7882.P3

Dewey Class. No.: 006.4

LDR 05404nmm a2200349 a 4500
001 2407932
003 DE-He213
005 20241102115737.0
006 m d
007 cr nn 008maaau
008 260204s2025 si s 0 eng d
020 $a 9789819785117 $q (electronic bk.)
020 $a 9789819785100 $q (paper)
024 7 $a 10.1007/978-981-97-8511-7 $2 doi
035 $a 978-981-97-8511-7
040 $a GP $c GP
041 0 $a eng
050 4 $a TK7882.P3
072 7 $a UYT $2 bicssc
072 7 $a COM016000 $2 bisacsh
072 7 $a UYT $2 thema
082 0 4 $a 006.4 $2 23
090 $a TK7882.P3 $b P921 2024
111 2 $a PRCV (Conference) $n (7th : $d 2024 : $c Ürümqi, China) $3 3779991
245 1 0 $a Pattern recognition and computer vision $h [electronic resource] : $b 7th Chinese Conference, PRCV 2024, Urumqi, China, October 18-20, 2024 : proceedings. $n Part VII / $c edited by Zhouchen Lin ... [et al.].
246 3 $a PRCV 2024
260 $a Singapore : $b Springer Nature Singapore : $b Imprint: Springer, $c 2025.
300 $a xiv, 587 p. : $b ill. (chiefly color), digital ; $c 24 cm.
490 1 $a Lecture notes in computer science, $x 1611-3349 ; $v 15037
505 0 $a Scene Text Recognition via k-NN Attention-based Decoder and Margin-based Softmax Loss -- Real-Time Text Detection with Multi-Level Feature Fusion and Pixel Clustering -- Refined and Locality-Enhanced Feature for Handwritten Mathematical Expression Recognition -- Learning Fine-grained and Semantically Aware Mamba Representations for Tampered Text Detection in Images -- Dual Feature Enhanced Scene Text Recognition Method for Low-Resource Uyghur -- Segmentation-free Todo Mongolian OCR and Its Public Dataset -- Hybrid Encoding Method for Scene Text Recognition in Low-Resource Uyghur -- ROBC: a Radical-Level Oracle Bone Character Dataset -- Integrated Recognition of Arbitrary-Oriented Multi-Line Billet Number -- Improving Scene Text Recognition with Counting Aware Contrastive Learning and Attention Alignment -- GridMask: An Efficient Scheme for Real Time Curved Scene Text Detection -- Tibetan Handwriting Recognition Method based on Structural Re-parameterization ViT and Vertical Attention -- MFH: Marrying Frequency Domain with Handwritten Mathematical Expression Recognition -- Leveraging Structure Knowledge and Deep Models for the Detection of Abnormal Handwritten Text -- OCR-aware Scene Graph Generation via Multi-modal Object Representation Enhancement and Logical Bias Learning -- Enhancing Transformer-based Table Structure Recognition for Long Tables -- Show Exemplars and Tell Me What You See: In-context Learning with Frozen Large Language Models for TextVQA -- MLR-NET: an arbitrary skew angle detection algorithm for complex layout document images -- TextViTCNN: Enhancing Natural Scene Text Recognition with Hybrid Transformer and Convolutional Networks -- Enhancing Visual Information Extraction with Large Language Models through Layout-aware Instruction Tuning -- SFENet: Arbitrary Shapes Scene Text Detection with Semantic Feature Extractor -- Improving Zero-Shot Image Captioning Efficiency with Metropolis-Hastings Sampling -- Improving Text Classification Performance through Multimodal Representation -- A Multi-feature Fusion Approach for Words Recognition of Ancient Mongolian Documents -- TableRocket: An Efficient and Effective Framework for Table Reconstruction -- Not All Texts Are the Same: Dynamically Querying Texts for Scene Text Detection -- Multi-Modal Attention based on 2D Structured Sequence for Table Recognition -- A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition -- Skeleton-Language Pre-training to Collaborate with Self-Supervised Human Action Recognition -- Spatio-Temporal Contrastive Learning for Compositional Action Recognition -- Path-Guided Motion Prediction with Multi-View Scene Perception -- Privacy-preserving Action Recognition: A Survey -- Attention-based Spatio-temporal modeling with 3D Convolutional Neural Networks for Dynamic Gesture Recognition -- MIT: Multi-cue Injected Transformer for Two-stage HOI Detection -- DIDA: Dynamic Individual-to-integrated Augmentation for Self-Supervised Skeleton-Based Action Recognition -- Multi-scale Spatial and Temporal Feature Aggregation Graph Convolutional Network for Skeleton-Based Action Recognition -- Improving Video Representation of Vision-Language Model with Decoupled Explicit Temporal Modeling -- KS-FuseNet: An efficient action recognition method based on keyframe selection and feature fusion -- Dynamic Skeleton Association Transformer for dyadic Interaction Action Recognition -- Species-Aware Guidance for Animal Action Recognition with Vision-Language Knowledge.
520 $a This 15-volume set LNCS 15031-15045 constitutes the refereed proceedings of the 7th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2024, held in Urumqi, China, during October 18-20, 2024. The 579 full papers presented were carefully reviewed and selected from 1526 submissions. The papers cover various topics in the broad areas of pattern recognition and computer vision, including machine learning, pattern classification and cluster analysis, neural network and deep learning, low-level vision and image processing, object detection and recognition, 3D vision and reconstruction, action recognition, video analysis and understanding, document analysis and recognition, biometrics, medical image analysis, and various applications.
650 0 $a Computer vision $x Congresses. $3 570734
650 0 $a Pattern recognition systems $v Congresses. $3 563039
650 1 4 $a Computer Imaging, Vision, Pattern Recognition and Graphics. $3 890871
650 2 4 $a Artificial Intelligence. $3 769149
650 2 4 $a Computer and Information Systems Applications. $3 3538505
650 2 4 $a Computer Communication Networks. $3 775497
650 2 4 $a Computer System Implementation. $3 892710
650 2 4 $a Machine Learning. $3 3382522
700 1 $a Lin, Zhouchen. $3 3453538
710 2 $a SpringerLink (Online service) $3 836513
773 0 $t Springer Nature eBook
830 0 $a Lecture notes in computer science ; $v 15037. $3 3780101
856 4 0 $u https://doi.org/10.1007/978-981-97-8511-7
950 $a Computer Science (SpringerNature-11645)