
Thursday, March 12, 2020

Reading the keras-gcn code

1. Introduction to the Cora dataset
Cora contains 2708 data points, each representing a scientific paper. The papers fall into 7 classes:

(a) Rule Learning
(b) Reinforcement Learning
(c) Probabilistic Methods
(d) Theory
(e) Neural Networks
(f) Genetic Algorithms
(g) Case Based

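Cora consists of two files, cora.cites and cora.content.

cora.content has 2708 lines. Each line holds a paper ID, the paper's word vector (a 1433-dimensional binary vector), and the paper's class label.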

Paper ID    Word vector (1433 binary values)       Class label
210XX       0 0..... 0 0 1 0 0 0 0 0 0 0 0 0       Neural_Networks
10723XX     0 0..... 0 0 0 0 0 1 0 0 0 0 0 0       Rule_Learning
11537XX     0 0..... 0 0 0 0 0 0 0 0 1 0 0         Reinforcement_Learning
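
To make the line layout concrete, here is a minimal sketch of how one cora.content line could be split into its three parts and how the class names could be turned into one-hot targets. The helper names `parse_content_line` and `one_hot` are hypothetical, and the sample lines are shortened, made-up stand-ins (real lines carry 1433 feature values); this is not necessarily how keras-gcn itself loads the file.

```python
def parse_content_line(line):
    """Split one cora.content line into (paper_id, features, label)."""
    fields = line.split()
    paper_id = fields[0]                       # first field: paper ID
    features = [int(v) for v in fields[1:-1]]  # middle fields: 0/1 word indicators
    label = fields[-1]                         # last field: class name
    return paper_id, features, label


def one_hot(labels):
    """Map class names to one-hot rows, using sorted class names as columns."""
    classes = sorted(set(labels))
    column = {name: i for i, name in enumerate(classes)}
    rows = [[1 if column[name] == j else 0 for j in range(len(classes))]
            for name in labels]
    return classes, rows


# Shortened, made-up sample lines (assumption: whitespace-separated fields).
sample = [
    "1001 0 0 1 0 Neural_Networks",
    "1002 0 1 0 0 Rule_Learning",
    "1003 1 0 0 0 Neural_Networks",
]
parsed = [parse_content_line(line) for line in sample]
paper_ids = [p[0] for p in parsed]
features = [p[1] for p in parsed]
classes, targets = one_hot([p[2] for p in parsed])
print(paper_ids)
print(classes)
print(targets)
```

Keeping the paper IDs in file order matters, because the row index of each paper is what later links the feature matrix to the citation graph.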

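cora.cites has 5429 lines. Each line holds two paper IDs: the first paper was published earlier, and the second paper cites the first.

First paper ID (cited)       Second paper ID (citing)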
35                           1033
35                           103482
35                           103515
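
The citation pairs can be turned into an adjacency matrix, which is the graph input a GCN works on. The sketch below is a plain-Python illustration under two assumptions: the function name `build_adjacency` is hypothetical, and citations are treated as undirected edges (the matrix is made symmetric). It uses only the three sample pairs listed above.

```python
def build_adjacency(paper_ids, cite_lines):
    """Build a symmetric 0/1 adjacency matrix from 'cited citing' ID pairs."""
    index = {pid: i for i, pid in enumerate(paper_ids)}  # paper ID -> row index
    n = len(paper_ids)
    adj = [[0] * n for _ in range(n)]
    for line in cite_lines:
        cited, citing = line.split()
        if cited not in index or citing not in index:
            continue  # skip pairs that mention an unknown paper
        i, j = index[cited], index[citing]
        adj[i][j] = 1  # edge cited -> citing
        adj[j][i] = 1  # mirrored edge: assumption that the graph is undirected
    return adj


paper_ids = ["35", "1033", "103482", "103515"]
cite_lines = ["35 1033", "35 103482", "35 103515"]
adj = build_adjacency(paper_ids, cite_lines)
for row in adj:
    print(row)
```

For the full dataset a sparse matrix would be the more practical choice, since 5429 edges among 2708 nodes leave almost all entries at zero.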

References
https://blog.csdn.net/u010159842/article/details/103234942



https://github.com/tkipf/keras-gcn
https://blog.csdn.net/Eric_1993/article/details/102907104
https://luweikxy.gitbooks.io/machine-learning-notes/content/content/deep-learning/graph-neural-networks/graph-convolutional-networks/gcn-preliminary-understand.html
https://www.zhihu.com/question/54504471

Monday, March 9, 2020

LAMMPS installation process

Updating Homebrew...
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/core and homebrew/cask).
==> Updated Formulae
camlp5                      lablgtk                     menhir                      ocaml                       ocaml-num                   ocamlsdl
inadyn                      libexosip                   mkvtoolnix                  ocaml-findlib               ocamlbuild                  verilator
==> Deleted Formulae
camlp4
==> Updated Casks
homebrew/cask/amethyst                    homebrew/cask/freeplane                   homebrew/cask/mucommander                 homebrew/cask/tinymediamanager
homebrew/cask/base                        homebrew/cask/j                           homebrew/cask/mudlet                      homebrew/cask/veusz
homebrew/cask/blitz                       homebrew/cask/keystore-explorer           homebrew/cask/numi                        homebrew/cask/xamarin-profiler
homebrew/cask/burn                        homebrew/cask/kode54-cog                  homebrew/cask/pinegrow                    homebrew/cask/yinxiangbiji
homebrew/cask/daisydisk                   homebrew/cask/mcgimp                      homebrew/cask/psychopy
homebrew/cask/electrocrud                 homebrew/cask/mellow                      homebrew/cask/refined-github-safari

==> Installing dependencies for lammps: hwloc, libevent, open-mpi, fftw and kim-api
==> Installing lammps dependency: hwloc
==> Downloading https://homebrew.bintray.com/bottles/hwloc-2.1.0.catalina.bottle.tar.gz
==> Downloading from https://akamai.bintray.com/94/94e4e238c45da330b53fde9c622e74a2dfabd00a17f37fa1807b1d828452759d?__gda__=exp=1583746103~hmac=200ec34b4765b09c65c523c
######################################################################## 100.0%
==> Pouring hwloc-2.1.0.catalina.bottle.tar.gz
==> Caveats
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d
==> Summary
🍺  /usr/local/Cellar/hwloc/2.1.0: 881 files, 9.5MB
==> Installing lammps dependency: libevent
==> Downloading https://homebrew.bintray.com/bottles/libevent-2.1.11_1.catalina.bottle.tar.gz
==> Downloading from https://akamai.bintray.com/9d/9d262f9ffb2268340a89c713826d8ca068bcac06c30baf49e6184ab4660d977a?__gda__=exp=1583746109~hmac=a9d25e6511b5ddd7eeaa86c
######################################################################## 100.0%
==> Pouring libevent-2.1.11_1.catalina.bottle.tar.gz
🍺  /usr/local/Cellar/libevent/2.1.11_1: 1,063 files, 5MB
==> Installing lammps dependency: open-mpi
==> Downloading https://homebrew.bintray.com/bottles/open-mpi-4.0.3.catalina.bottle.tar.gz
==> Downloading from https://akamai.bintray.com/3b/3b143cf02a5345bb0d4df0777d3a34f806ff7fb66dc5d21993b8c4f218722ac7?__gda__=exp=1583746115~hmac=253bae48c6c40c68026dc37
######################################################################## 100.0%
==> Pouring open-mpi-4.0.3.catalina.bottle.tar.gz
🍺  /usr/local/Cellar/open-mpi/4.0.3: 752 files, 11.3MB
==> Installing lammps dependency: fftw
==> Downloading https://homebrew.bintray.com/bottles/fftw-3.3.8_1.catalina.bottle.tar.gz
==> Downloading from https://akamai.bintray.com/27/271f7a5febd92b8757948c86fa24546aca9025087406056b7cf4036ca97b29e4?__gda__=exp=1583746121~hmac=4b173984e6ec8b297aa0ec8
######################################################################## 100.0%
==> Pouring fftw-3.3.8_1.catalina.bottle.tar.gz
🍺  /usr/local/Cellar/fftw/3.3.8_1: 73 files, 14.8MB
==> Installing lammps dependency: kim-api
==> Downloading https://homebrew.bintray.com/bottles/kim-api-2.1.3.catalina.bottle.tar.gz
==> Downloading from https://akamai.bintray.com/ae/ae59f45bf8d539f1b633120cdb942c2a0a0f1880eee4ee91981f268792e1b1c4?__gda__=exp=1583746126~hmac=464da1ff6f45e08326d29d8
######################################################################## 100.0%
==> Pouring kim-api-2.1.3.catalina.bottle.tar.gz
Warning: kim-api dependency gcc was built with a different C++ standard
library (libstdc++ from clang). This may cause problems at runtime.
==> Caveats
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d

zsh completions and functions have been installed to:
  /usr/local/share/zsh/site-functions

Emacs Lisp files have been installed to:
  /usr/local/share/emacs/site-lisp/kim-api
==> Summary
🍺  /usr/local/Cellar/kim-api/2.1.3: 1,285 files, 21.6MB
==> Installing lammps
==> Downloading https://homebrew.bintray.com/bottles/lammps-2019-08-07.catalina.bottle.tar.gz
==> Downloading from https://akamai.bintray.com/60/604056c80bb3b36f0a1644388bff26f2bcdc3e2f2541247e9a7ce941b20e9bcc?__gda__=exp=1583746133~hmac=2bd684fb81d50cc717258b5
######################################################################## 100.0%
==> Pouring lammps-2019-08-07.catalina.bottle.tar.gz
Warning: lammps dependency gcc was built with a different C++ standard
library (libstdc++ from clang). This may cause problems at runtime.
🍺  /usr/local/Cellar/lammps/2019-08-07: 5,788 files, 335.5MB
==> Caveats
==> hwloc
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d
==> kim-api
Bash completion has been installed to:
  /usr/local/etc/bash_completion.d

zsh completions and functions have been installed to:
  /usr/local/share/zsh/site-functions

Emacs Lisp files have been installed to:
  /usr/local/share/emacs/site-lisp/kim-api

Installing TensorFlow 2.1.0, Keras 2.3.1, scikit-learn, and RDKit on Mac

TensorFlow:
I chose a system-wide install rather than a virtual environment.

$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
$ export PATH="/usr/local/bin:/usr/local/sbin:$PATH"
$ brew update
$ brew install python
$ sudo pip3 install -U virtualenv
$ pip3 install --user --upgrade tensorflow

If the Python version is 3.7.3, linecache.py needs to be patched:

$ sudo vi /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/linecache.py

Change line 48 to:
for mod in list(sys.modules.values()): # for mod in sys.modules.values():

Keras:

$ brew install hdf5
$ brew install graphviz
$ sudo pip3 install pydot    # sudo is required
$ vi ~/.bash_profile
export PATH=/usr/local/bin:/usr/local/sbin:~/bin:$PATH
$ source ~/.bash_profile
$ sudo pip3 install keras    # sudo is required

scikit-learn:
$ pip3 install -U scikit-learn

RDKit:
$ brew tap rdkit/rdkit
$ brew install rdkit

If you need to have icu4c first in your PATH run:
  echo 'export PATH="/usr/local/opt/icu4c/bin:$PATH"' >> ~/.zshrc
  echo 'export PATH="/usr/local/opt/icu4c/sbin:$PATH"' >> ~/.zshrc

For compilers to find icu4c you may need to set:
  export LDFLAGS="-L/usr/local/opt/icu4c/lib"
  export CPPFLAGS="-I/usr/local/opt/icu4c/include"

==> openblas
openblas is keg-only, which means it was not symlinked into /usr/local,
because macOS provides BLAS and LAPACK in the Accelerate framework.

For compilers to find openblas you may need to set:
  export LDFLAGS="-L/usr/local/opt/openblas/lib"
  export CPPFLAGS="-I/usr/local/opt/openblas/include"

==> rdkit
      You may need to add RDBASE to your environment variables.
      For Bash, put something like this in your $HOME/.bashrc:
        export RDBASE=/usr/local/share/RDKit

Reference
https://github.com/rdkit/homebrew-rdkit

RuntimeError: dictionary changed size during iteration when running TensorFlow on Mac

Problem:
The Homebrew python3 version is 3.7.3.
Running TensorFlow produces the following error:

.
.
.
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/linecache.py", line 48, in getlines
    for mod in sys.modules.values():

RuntimeError: dictionary changed size during iteration

Solution:
$ sudo vi /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/linecache.py

Change line 48 to:
for mod in list(sys.modules.values()): # for mod in sys.modules.values():
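
Why wrapping the view in list() helps can be reproduced with an ordinary dictionary. The sketch below is a standalone illustration, not the linecache code itself: the function names and the sample dictionaries are made up. Iterating over a live `values()` view while the dictionary grows raises the same RuntimeError, whereas iterating over a `list(...)` snapshot does not.

```python
def iterate_live_view(modules):
    """Iterate the live values() view while the dict grows; return the error text."""
    try:
        for _ in modules.values():
            modules["added_during_loop"] = "new"  # dict grows mid-iteration
    except RuntimeError as exc:
        return str(exc)
    return None


def iterate_snapshot(modules):
    """Iterate a list() snapshot; growing the dict no longer disturbs the loop."""
    seen = []
    for name in list(modules.values()):
        modules["added_" + name] = "new"  # safe: the loop walks the copy
        seen.append(name)
    return seen


print(iterate_live_view({"a": "mod_a", "b": "mod_b"}))
print(iterate_snapshot({"a": "mod_a", "b": "mod_b"}))
```

In the TensorFlow case the dictionary is `sys.modules`, which can change while it is being walked because modules are still being imported; the snapshot makes the loop independent of those changes.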



Reference
http://www.kaierlong.me/blog/post/kaierlong/tensorflow-2.0-%E9%94%99%E8%AF%AF-RuntimeError-dictionary-changed-size-during-iteration

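Tuesday, March 3, 2020

Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data


The translation equivariance of convolutional layers enables convolutional neural networks to generalize well on image problems. While translation equivariance provides a powerful inductive bias for images, we often also want equivariance to other transformations, such as rotations, especially for non-image data. The authors propose a general method to construct a convolutional layer that is equivariant to transformations from any specified Lie group with a surjective exponential map. Incorporating equivariance to a new group requires implementing only the group exponential and logarithm maps, which enables rapid prototyping. To showcase the simplicity and generality of the method, the same model architecture is applied to images, ball-and-stick molecular data, and Hamiltonian dynamical systems. For Hamiltonian systems, the equivariance of the models is especially impactful, leading to exact conservation of linear and angular momentum.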

https://arxiv.org/pdf/2002.12880.pdf

A Deep Generative Model for Fragment-Based Molecule Generation

This paper addresses molecule generation, a challenging problem in cheminformatics. Deep generative approaches generally fall into two types: one encodes molecular graphs as strings of text and learns the corresponding character-based language model, while the other operates directly on the molecular graph. The string-based approach has two limitations: it tends to generate invalid molecules and duplicate molecules.

To overcome these limitations, the authors propose a language model over small molecular substructures called fragments, loosely inspired by the well-known paradigm of Fragment-Based Drug Design. Put simply, they generate molecules fragment by fragment instead of atom by atom.

The authors of this paper experimentally show that their model largely outperforms other language model-based competitors, reaching state-of-the-art performances typical of graph-based approaches.



In terms of the methods used in this paper, the main approach encompasses three steps:

1. Break molecules into sequences of fragments
2. Encode them as SMILES words
3. Learn their corresponding language model

https://arxiv.org/pdf/2002.12826.pdf

https://github.com/marcopodda/fragment-based-dgm

Monday, March 2, 2020

PRML algorithms implemented in Python

Python code implementing the algorithms described in Bishop's book "Pattern Recognition and Machine Learning": https://github.com/ctgk/PRML

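GML: an automated machine learning package

A new automated machine learning package for Python has been published on PyPI: GML (Ghalat Machine Learning). It currently offers the following features:
✓ Automatic selection of machine learning and neural network models
✓ Automatic hyperparameter tuning
✓ Ranking of models (by cross-validation score)
✓ Recommendation of the best model
(next update: automatic feature engineering)
► Install: pip install GML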

Tuesday, February 25, 2020

MIT researchers use AI to discover a powerful antibiotic; the work made the cover of Cell

https://technews.tw/2020/02/25/artificial-intelligence-yields-new-antibiotic/?fbclid=IwAR1UBZvz5xGyVv-DonIlf9DNc0hLx1fDY10P9UuGSzxY5wSxMSf4atnpwG8

COCO-GAN: Generation by Parts via Conditional Coordinating

https://hubert0527.github.io/COCO-GAN/?fbclid=IwAR3rLHb95RCSOf7hjg1XUnpArlOYa8Q56DlWetrwxWzyWuLb7584Je3BLuI

Python iterator and generator

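To be an iterator, a class must first define an __iter__ method, which tells the interpreter that objects of the class can be iterated over; it is called when iteration starts and returns the iterator itself (self). The class must also define a __next__ method so that outside code can ask the iterator for the next element through the built-in function next().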
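
A minimal sketch of such a class follows. The `Countdown` name and its behaviour are made up for illustration; the point is only the pair of `__iter__` and `__next__` methods and the `StopIteration` signal that ends iteration.

```python
class Countdown:
    """Iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # Called once when iteration begins; an iterator returns itself.
        return self

    def __next__(self):
        # Called for every element; StopIteration ends the loop.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


it = Countdown(3)
print(next(it))   # first element, requested explicitly
print(list(it))   # remaining elements, consumed through __iter__/__next__
```

Once exhausted, the same object yields nothing more; a fresh `Countdown` has to be created to iterate again.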


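A generator can be created either by a function that uses the yield statement or by a generator expression (written like a comprehension inside parentheses) to produce a series of numbers. Its elements can be retrieved with in or with next().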
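
Both forms can be sketched in a few lines. The names `squares` and `gen_expr` are made up for illustration. The example also shows that a generator computes its values one at a time on demand, which is why an unbounded generator is usable as long as only part of it is consumed.

```python
from itertools import islice


def squares():
    """Generator function: yields 1, 4, 9, ... without ever finishing."""
    n = 1
    while True:
        yield n * n
        n += 1


# Generator expression: parentheses, not a tuple.
gen_expr = (n * n for n in range(1, 4))

first_three = list(islice(squares(), 3))  # take only three values from the endless stream
print(first_three)
print(next(gen_expr))   # pulls the first element
print(4 in gen_expr)    # membership test consumes elements until it finds 4
print(list(gen_expr))   # whatever is left after the two steps above
```

Because a generator is consumed as it is read, the last line prints only the elements that the earlier `next()` and `in` calls did not already use up.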


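In conclusion, a generator is itself an iterator: both produce their elements lazily, one at a time, so both are suitable for large or even unbounded sequences. A generator is simply the more concise way to write one, while a hand-written iterator class is useful when extra state or methods are needed.

Finally, just for fun, I wrote a combined version.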




Reference
https://anandology.com/python-practice-book/iterators.html

Monday, February 10, 2020

Applications of linear algebra in graph theory

Part 1: Adjacency matrix
https://ccjou.wordpress.com/2010/01/18/%E7%B7%9A%E6%80%A7%E4%BB%A3%E6%95%B8%E5%9C%A8%E5%9C%96%E8%AB%96%E7%9A%84%E6%87%89%E7%94%A8-%E4%B8%80%EF%BC%9A%E9%84%B0%E6%8E%A5%E7%9F%A9%E9%99%A3/

Part 2: Incidence matrix
https://ccjou.wordpress.com/2013/08/30/%E7%B7%9A%E6%80%A7%E4%BB%A3%E6%95%B8%E5%9C%A8%E5%9C%96%E8%AB%96%E7%9A%84%E6%87%89%E7%94%A8-%E4%BA%8C%EF%BC%9A%E9%97%9C%E8%81%AF%E7%9F%A9%E9%99%A3/

Part 3: Laplacian matrix
https://ccjou.wordpress.com/2014/12/04/%E7%B7%9A%E6%80%A7%E4%BB%A3%E6%95%B8%E5%9C%A8%E5%9C%96%E8%AB%96%E7%9A%84%E6%87%89%E7%94%A8-%E4%B8%89%EF%BC%9A%E6%8B%89%E6%99%AE%E6%8B%89%E6%96%AF%E7%9F%A9%E9%99%A3/

Demystifying Different Variants of Gradient Descent Optimization Algorithm

https://medium.com/hackernoon/demystifying-different-variants-of-gradient-descent-optimization-algorithm-19ae9ba2e9bc

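Andrew Ng's "Machine Learning Yearning"

This is a very practical machine learning handbook. It covers the problems commonly encountered when carrying out machine learning projects and how to solve them. Well worth reading!

There is no Traditional Chinese translation yet, but a Simplified Chinese translation is available!

Original:
Simplified Chinese translation: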