Open-Source Activities

Implementations

Official

KOFFVQA
KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language
CANVAS
CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction
HerO
HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims
EnCLAP
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
PhaseAug
PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping
Assem-VC
Assem-VC: Realistic Voice Conversion by Assembling Modern Speech Synthesis Techniques
NU-Wave
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
NU-Wave 2
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates
Cotatron
Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data

Unofficial

pNLP-Mixer
pNLP-Mixer: an Efficient all-MLP Architecture for Language
First successful open-source implementation of pNLP-Mixer.
HifiFace
HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping
First successful open-source implementation of HifiFace.
UnivNet
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
First successful open-source implementation of UnivNet.
WaveGrad2
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
First successful open-source implementation of WaveGrad2.
FaceShifter
FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping
First successful open-source implementation of FaceShifter.
Reformer-pytorch
Reformer: The Efficient Transformer
Implementation of Reformer in PyTorch.
MelGAN
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Implementation of MelGAN vocoder (compatible with NVIDIA/tacotron2).
MelNet
MelNet: A Generative Model for Audio in the Frequency Domain
Implementation of MelNet. Work done with Deepest AI (SNU Deep Learning Society).
RandWireNN
Exploring Randomly Wired Neural Networks for Image Recognition
Implementation of RandWireNN.
VoiceFilter
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
First successful open-source implementation of Google's VoiceFilter.

Libraries & Tools

KoTDG
Korean Text Data Generator for OCR tasks.
Data-Science-for-COVID-19
COVID-19 Korea Dataset with patient routes and visualizer. Co-led the collaboration project.
Alias-Free-Torch
Simple torch.nn.Module implementation of Alias-Free GAN-style filtering and resampling.