Char :: Natural Language Processing (NLP) - Word Embeddings (Google)


07.AI 2020. 7. 13. 11:53


1. Word-level embedding models: represent each word as a vector and measure similarity between words

- Prediction-based: NPLM, Word2Vec, FastText, etc.

- Matrix-factorization-based: LSA, GloVe, Swivel, etc.
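
Once words are vectors, similarity between two words is commonly measured as the cosine of the angle between their vectors. The sketch below illustrates only that step. The three-dimensional `toy_vectors` are invented for illustration and do not come from any trained model; real Word2Vec or GloVe vectors typically have hundreds of dimensions.

```python
import math

# Hypothetical toy vectors, made up for this example only.
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_king_queen = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_king_apple = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])

# With these toy values, "king" is closer to "queen" than to "apple".
print(sim_king_queen > sim_king_apple)  # prints True
```

With a trained model, the same function would be applied to the learned vectors instead of the hand-written ones.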

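2. Sentence-level embedding models: represent each document (or sentence) as a vector and measure similarity between documents

- Probability-based: LDA

- Matrix-factorization-based: LSA

- Neural-network-based: Doc2Vec, ELMo, GPT, BERT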
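
The same idea applies at the document level: each document becomes one vector, and documents are compared by vector similarity. The sketch below uses plain bag-of-words counts as a simple stand-in for the document vector. It is not LDA, LSA, or Doc2Vec, which produce dense, lower-dimensional vectors. The helper names `bow_vector` and `doc_cosine` and the sample sentences are assumptions made for this example.

```python
import math
from collections import Counter

def bow_vector(text):
    """Map a document to word counts (a sparse bag-of-words vector)."""
    return Counter(text.lower().split())

def doc_cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    # Only words present in both documents contribute to the dot product.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)

docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock prices fell sharply today",
]
vectors = [bow_vector(d) for d in docs]

sim_related = doc_cosine(vectors[0], vectors[1])    # shares "the", "cat", "on"
sim_unrelated = doc_cosine(vectors[0], vectors[2])  # shares no words

print(sim_related, sim_unrelated)
```

Count vectors only match identical words, so two documents that use different words for the same topic score zero. That limitation is one motivation for the learned document embeddings listed above.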

https://developers.googleblog.com/2018/04/text-embedding-models-contain-bias.html

Text Embedding Models Contain Bias. Here's Why That Matters.

Human data encodes human biases by default. Being aware of this is a good start, and the conversation around how to handle it is ongoing. At Google, we are actively researching unintended bias analysis and mitigation strategies because we are committed to


