Publications

(2023). Uncurated image-text datasets: Shedding light on demographic bias. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

(2023). Toward verifiable and reproducible human evaluation for text-to-image generation. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

(2023). Not only generative art: Stable diffusion for content-style disentanglement in art analysis. Proc. 2023 ACM International Conference on Multimedia Retrieval (ICMR).

(2023). Multi-modal humor segment prediction in video. Multimedia Systems.

(2023). Model-agnostic gender debiased image captioning. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

(2023). Learning bottleneck concepts in image classification. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

(2023). ICDAR’23: Intelligent Cross-Data Analysis and Retrieval. Proc. ACM International Conference on Multimedia Retrieval.

(2023). Real-time estimation of the remaining surgery duration for cataract surgery using deep convolutional neural networks and long short-term memory. BMC Medical Informatics and Decision Making.

(2023). Inverse Rendering of Translucent Objects using Physical and Neural Renderers. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

(2023). Human-Imperceptible Identification With Learnable Lensless Imaging. IEEE Access.

(2023). Development of a vertex finding algorithm using Recurrent Neural Network. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment.

(2023). Cross-language font style transfer. Applied Intelligence.

(2023). Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision.

(2023). Automated grading system of retinal arterio-venous crossing patterns: A deep learning approach replicating ophthalmologist’s diagnostic process of arteriolosclerosis. PLOS Digital Health.

(2022). Quantifying Societal Bias Amplification in Image Captioning. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

(2022). Acquiring a Dynamic Light Field Through a Single-Shot Coded Image. Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

(2022). Tone Classification for Political Advertising Video using Multimodal Cues. Proceedings of the 3rd ACM Workshop on Intelligent Cross-Data Analysis and Retrieval.

(2022). Multi-label disengagement and behavior prediction in online learning. Artificial Intelligence in Education: 23rd International Conference, AIED 2022, Durham, UK, July 27–31, 2022, Proceedings, Part I.

(2022). Match them up: visually explainable few-shot image classification. Applied Intelligence.

(2022). ICDAR'22: Intelligent Cross-Data Analysis and Retrieval. Proceedings of the 2022 International Conference on Multimedia Retrieval.

(2022). Emotional Intensity Estimation based on Writer’s Personality. Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Student Research Workshop.

(2022). Depthwise spatio-temporal STFT convolutional neural networks for human action recognition. IEEE Trans. Pattern Analysis and Machine Intelligence.

(2022). Deep Gesture Generation for Social Robots Using Type-Specific Libraries. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

(2022). Corpus Construction for Historical Newspapers: A Case Study on Public Meeting Corpus Construction Using OCR Error Correction. SN Computer Science.

(2021). Transferring domain-agnostic knowledge in video question answering. Proc. British Machine Vision Conference (BMVC).

(2021). Image Retrieval by Hierarchy-aware Deep Hashing Based on Multi-task Learning. Proc. ACM International Conference on Multimedia Retrieval (ICMR).

(2021). GCNBoost: Artwork Classification by Label Propagation Through a Knowledge Graph. Proc. ACM International Conference on Multimedia Retrieval (ICMR).

(2021). Explain me the painting: Multi-topic knowledgeable art description generation. Proc. IEEE/CVF International Conference on Computer Vision (ICCV).

(2021). Built year prediction from Buddha face with heterogeneous labels. Proc. Workshop on Structuring and Understanding of Multimedia Heritage Contents (SUMAC).

(2021). PoseRN: A 2D pose refinement network for bias-free multi-view 3D human pose estimation. Proc. International Conference on Image Processing (ICIP).

(2021). Learners' efficiency prediction using facial behavior analysis. Proc. International Conference on Image Processing (ICIP).

(2021). Attending self-attention: A case study of visually grounded supervision in vision-and-language transformers. Proc. Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop.

(2021). A comparative study of language Transformers for video question answering. Neurocomputing.

(2021). WRIME: A new dataset for emotional intensity estimation with subjective and objective annotations. Proc. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT).

(2021). Noisy-LSTM: Improving temporal awareness for video semantic segmentation. IEEE Access.

(2021). The laughing machine: Predicting humor in video. Proceedings - IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).

(2021). Preventing fake information generation against media clone attacks. IEICE Transactions on Information and Systems.

(2021). Generation and detection of media clones. IEICE Transactions on Information and Systems.

(2021). CFA Handling and Quality Analysis for Compressive Light Field Camera. ITE Transactions on Media Technology and Applications.

(2020). Cross-lingual visual grounding. IEEE Access.

(2020). IDSOU at WNUT-2020 Task 2: Identification of informative COVID-19 English tweets. Proceedings - Workshop on Noisy User-generated Text (W-NUT 2020).

(2020). Improving topic modeling through homophily for legal documents. Applied Network Science.

(2020). Following Embryonic Stem Cells, Their Differentiated Progeny, and Cell-State Changes During iPS Reprogramming by Raman Spectroscopy. Analytical Chemistry.

(2020). Diagnostic performance for pulmonary adenocarcinoma on CT: comparison of radiologists with and without three-dimensional convolutional neural network. European Radiology.

(2020). Visually grounded paraphrase identification via gating and phrase localization. Neurocomputing.

(2020). Red-Fluorescent Pt Nanoclusters for Detecting and Imaging HER2 in Breast Cancer Cells. ACS Omega.

(2020). Improvement of nerve imaging speed with coherent anti-Stokes Raman scattering rigid endoscope using deep-learning noise reduction. Scientific Reports.

(2020). YOLO in the Dark - Domain adaptation method for merging multiple models -. Proceedings - European Conference on Computer Vision.

(2020). Knowledge-based video question answering with unsupervised scene descriptions. Proceedings - European Conference on Computer Vision.

(2020). Demographic influences on contemporary art with unsupervised style embeddings. Proceedings - European Conference on Computer Vision Workshops.

(2020). Acquiring dynamic light fields through coded aperture camera. Proceedings - European Conference on Computer Vision.

(2020). Nerve segmentation with deep learning from label-free endoscopic images obtained using coherent anti-Stokes Raman scattering. Biomolecules.

(2020). Information extraction from public meeting articles [公開集会記事からの情報抽出].

(2020). Corpus construction from historical newspaper data using OCR error correction [OCR誤り訂正を用いた歴史新聞データからのコーパス構築].

(2020). Constructing a public meeting corpus. Proceedings - the 12th International Conference on Language Resources and Evaluation (LREC 2020).

(2020). Yoga-82: a new dataset for fine-grained classification of human poses. The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.

(2020). Convolutional Neural Network Can Recognize Drug Resistance of Single Cancer Cells. International Journal of Molecular Sciences.

(2020). Detecting learner drowsiness based on facial expressions and head movements in online courses. International Conference on Intelligent User Interfaces, Proceedings IUI.

(2020). KnowIT VQA: Answering knowledge-based questions about videos. Proceedings - 2020 AAAI Conference on Artificial Intelligence.

(2020). Warmer Environments Increase Implicit Mental Workload Even If Learning Efficiency Is Enhanced. Frontiers in Psychology.

(2020). Speech-driven face reenactment for a video sequence. ITE Transactions on Media Technology and Applications.

(2020). Joint learning of vessel segmentation and artery/vein classification with post-processing. Medical Imaging with Deep Learning (MIDL).

(2020). IterNet: retinal image segmentation utilizing structural redundancy in vessel networks. Proceedings - The IEEE Winter Conference on Applications of Computer Vision (WACV).

(2020). ContextNet: representation and exploration for painting classification and retrieval in context. International Journal of Multimedia Information Retrieval.

(2020). BERT representations for video question answering. Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020.

(2020). Action recognition from a single coded image. Proceedings - 2020 IEEE International Conference on Computational Photography (ICCP).

(2020). 5D Light Field Synthesis from a Monocular Video. International Conference on Pattern Recognition.

(2020). 3D Image Reconstruction from Multi-focus Microscopic Images. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).

(2019). Use of big data in historical research: Focusing on Australia [歴史研究におけるビッグデータの活用-オーストラリアを中心に]. 西洋史学.

(2019). Reflectance and Shape Estimation with a Light Field Camera Under Natural Illumination. International Journal of Computer Vision.

(2019). Public meeting corpus construction and content delivery. 人文科学とコンピュータシンポジウム2019.

(2019). Deep-UV excitation fluorescence microscopy for detection of lymph node metastasis using deep neural network. Scientific Reports.

(2019). Contextualized multi-sense word embedding. Journal of Natural Language Processing.

(2019). Legal information as a complex network: Improving topic modeling through homophily. Proceedings - International Conference on Complex Networks and Their Applications.

(2019). Human shape reconstruction with loose clothes from partially observed data by pose specific deformation. Proceedings - Pacific-Rim Symposium on Image and Video Technology.

(2019). Deep compressive sensing for visual privacy protection in flatcam imaging. Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019.

(2019). Metric for automatic machine translation evaluation based on pre-trained sentence embeddings. Journal of Natural Language Processing.

(2019). A 3-D Display Pipeline from Coded-Aperture Camera to Tensor Light-Field Display Through CNN. Proceedings - International Conference on Image Processing, ICIP.

(2019). Excitation of erbium-doped nanoparticles in 1550-nm wavelength region for deep tissue imaging with reduced degradation of spatial resolution. Journal of Biomedical Optics.

(2019). Application of deep learning (3-dimensional convolutional neural network) for the prediction of pathological invasiveness in lung adenocarcinoma. Medicine.

(2019). Corpus construction from historical newspaper data [歴史新聞データからのコーパス構築].

(2019). Multimodal learning analytics: Society 5.0 project in Japan. Companion Proceedings of the 9th International Conference on Learning Analytics & Knowledge.

(2019). Fall detection using optical level anonymous image sensing system. Optics and Laser Technology.

(2019). Video meets knowledge in visual question answering. Proceedings of the Meeting on Image Recognition and Understanding (MIRU2019).

(2019). Rethinking the evaluation of video summaries. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

(2019). Negative lexically constrained decoding for paraphrase generation. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019).

(2019). Historical and modern features for Buddha statue classification. SUMAC 2019 - Proceedings of the 1st Workshop on Structuring and Understanding of Multimedia heritAge Contents, co-located with MM 2019.

(2019). High-Speed Imaging Using CMOS Image Sensor With Quasi Pixel-Wise Exposure. IEEE Transactions on Computational Imaging.

(2019). Facial expression recognition with skip-connection to leverage low-level features. Proceedings - IEEE International Conference on Image Processing (ICIP).

(2019). Efficacy of Novel Multispectral Imaging Device to Determine Anastomosis for Esophagogastrostomy. Journal of Surgical Research.

(2019). Controllable text simplification with lexical constraint loss. Proceedings of the ACL 2019 Student Research Workshop (ACL 2019 SRW).

(2019). Contextualized context2vec. Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019).

(2019). Context-aware embeddings for automatic art analysis. Proceedings of the 2019 ACM International Conference on Multimedia Retrieval (ICMR).

(2019). Buda.art: A multimodal content-based analysis and retrieval system for Buddha statues. Proceedings of the 27th ACM International Conference on Multimedia (MM).

(2019). A Coded Aperture for Watermark Extraction from Defocused Images. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).

(2018). Space-time-brightness sampling using an adaptive pixel-wise coded exposure. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

(2018). Representing a partially observed non-rigid 3D human using eigen-texture and eigen-deformation. Proceedings - International Conference on Pattern Recognition (ICPR).

(2018). Finding important people in a video using deep neural networks with conditional random fields. IEICE Transactions on Information and Systems.

(2018). Invited Article: Label-free nerve imaging with a coherent anti-Stokes Raman scattering rigid endoscope using two optical fibers for laser delivery. APL Photonics.

(2018). Designing coded aperture camera based on PCA and NMF for light field acquisition. IEICE Transactions on Information and Systems.

(2018). Summarization of user-generated sports video by using deep action recognition features. IEEE Transactions on Multimedia.

(2018). Iterative applications of image completion with CNN-based failure detection. Journal of Visual Communication and Image Representation.

(2018). iParaphrasing: Extracting visually grounded paraphrases via an image. Proceedings - International Conference on Computational Linguistics (COLING).

(2018). PCA-coded aperture for light field photography. Proceedings - International Conference on Image Processing, ICIP.

(2018). Visually grounded paraphrase extraction. Proceedings of the 27th International Conference on Computational Linguistics.

(2018). The dynamic photometric stereo method using a multi-tap CMOS image sensor. Sensors (Switzerland).

(2018). RUSE: Regressor using sentence embeddings for automatic machine translation evaluation. Proceedings of the Third Conference on Machine Translation: Shared Task Papers (WMT 18).

(2018). Metric for automatic machine translation evaluation based on universal sentence representations. Proceedings of the NAACL 2018 Student Research Workshop (NAACL 2018 SRW).

(2018). Learning to capture light fields through a coded aperture camera. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).

(2018). Joint optimization for compressive video sensing and reconstruction under hardware constraints. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).

(2018). Graphical classification of DNA sequences of HLA alleles by deep learning. Human Cell.

(2018). Complex word identification based on frequency in a learner corpus. Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications (BEA 13).

(2018). Coherent anti-Stokes Raman scattering rigid endoscope toward robot-assisted surgery. Biomedical Optics Express.

(2017). Adapting local features for face detection in thermal image. Sensors (Switzerland).

(2017). Augmented reality marker hiding with texture deformation. IEEE Transactions on Visualization and Computer Graphics.

(2017). Adaptive background model registration for moving cameras. Pattern Recognition Letters.

(2017). Novel view synthesis with light-weight view-dependent texture mapping for a stereoscopic HMD. Proceedings - IEEE International Conference on Multimedia and Expo.

(2017). Video summarization using textual descriptions for authoring video blogs. Multimedia Tools and Applications.

(2017). Hyperspectral imaging using flickerless active LED illumination. Thirteenth International Conference on Quality Control by Artificial Vision 2017.

(2017). Video question answering to find a desired video segment. Proceedings - Open Knowledge Base and Question Answering Workshop at SIGIR.

(2017). Unsupervised Video Summarization using Deep Video Features. Proceedings of the Meeting on Image Recognition and Understanding (MIRU2017).

(2017). ReMagicMirror: Action learning using human reenactment with the mirror metaphor. Proceedings - International Conference on Multimedia Modeling (MMM).

(2017). Realtime novel view synthesis with eigen-texture regression. Proceedings - British Machine Vision Conference (BMVC).

(2017). Mixed features for face detection in thermal image. Proceedings of SPIE - The International Society for Optical Engineering.

(2017). Incremental structural modeling on sparse visual SLAM. Proceedings of the 15th IAPR International Conference on Machine Vision Applications, MVA 2017.

(2017). Increasing pose comprehension through augmented reality reenactment. Multimedia Tools and Applications.

(2017). Fine-grained video retrieval for multi-clip video. Proceedings - Workshop on Closing the Loop Between Vision and Language at ICCV.

(2017). Classification of C2C12 cells at differentiation by convolutional neural network of deep learning using phase contrast images. Human Cell.

(2016). High-speed imaging using CMOS image sensor with quasi pixel-wise exposure. 2016 IEEE International Conference on Computational Photography, ICCP 2016 - Proceedings.

(2016). Dynamic photometric stereo method using multi-tap CMOS image sensor. Proceedings - International Conference on Pattern Recognition.
