Q-навчання (Ukrainian Wikipedia)

Analysis of the information sources cited in the references of the Wikipedia article "Q-навчання" (Q-learning) in the Ukrainian-language version.

Refs | Website | Rank (global / Ukrainian)
12 | web.archive.org | 1st place
4 | books.google.com | 3rd place / 11th place
3 | doi.org | 2nd place / 4th place
2 | ualberta.ca | 3,600th place / 3,626th place
2 | worldcat.org | 5th place / 9th place
2 | nih.gov | 4th place / 5th place
1 | utl.pt | low place
1 | ut.ee | 8,317th place / low place
1 | incompleteideas.net | low place
1 | archive.org | 6th place
1 | leemon.com | low place
1 | arxiv.org | 69th place / 188th place
1 | huji.ac.il | 3,903rd place / 6,283rd place
1 | bkgm.com | low place
1 | rhul.ac.uk | low place / 8,811th place
1 | storage.googleapis.com | 5,609th place / 3,730th place
1 | nips.cc | low place
1 | aaai.org | 9,352nd place / low place
1 | microsoft.com | 153rd place / 227th place

aaai.org

van Hasselt, Hado; Guez, Arthur; Silver, David (2015). Deep reinforcement learning with double Q-learning. AAAI Conference on Artificial Intelligence: 2094–2100. Archived from the original (PDF) on 6 February 2020. Retrieved 4 March 2020. (in English)

archive.org

Russell, Stuart J.; Norvig, Peter (2010). Artificial Intelligence: A Modern Approach (Third ed.). Prentice Hall. p. 649. ISBN 978-0136042594. (in English)

arxiv.org

François-Lavet, Vincent; Fonteneau, Raphael; Ernst, Damien (7 December 2015). How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies. arXiv:1512.02011 [cs.LG]. (in English)

bkgm.com

Tesauro, Gerald (March 1995). Temporal Difference Learning and TD-Gammon. Communications of the ACM. 38 (3): 58–68. doi:10.1145/203330.203343. Archived from the original on 9 February 2010. Retrieved 8 February 2010. (in English)

books.google.com

Hasselt, Hado van (5 March 2012). Reinforcement Learning in Continuous State and Action Spaces. In Wiering, Marco; Otterlo, Martijn van (eds.). Reinforcement Learning: State-of-the-Art. Springer Science & Business Media. pp. 207–251. ISBN 978-3-642-27645-3. (in English)
Bozinovski, S. (15 July 1999). Crossbar Adaptive Array: The first connectionist network that solved the delayed reinforcement learning problem. In Dobnikar, Andrej; Steele, Nigel C.; Pearson, David W.; Albrecht, Rudolf F. (eds.). Artificial Neural Nets and Genetic Algorithms: Proceedings of the International Conference in Portorož, Slovenia, 1999. Springer Science & Business Media. pp. 320–325. ISBN 978-3-211-83364-3. (in English)
Bozinovski, S. (1982). A self learning system using secondary reinforcement. In Trappl, Robert (ed.). Cybernetics and Systems Research: Proceedings of the Sixth European Meeting on Cybernetics and Systems Research. North Holland. pp. 397–402. ISBN 978-0-444-86488-8. (in English)
Barto, A. (24 February 1997). Reinforcement learning. In Omidvar, Omid; Elliott, David L. (eds.). Neural Systems for Control. Elsevier. ISBN 978-0-08-053739-9. (in English)

doi.org

Shteingart, Hanan; Neiman, Tal; Loewenstein, Yonatan (May 2013). The role of first impression in operant learning (PDF). Journal of Experimental Psychology: General. 142 (2): 476–488. doi:10.1037/a0029550. ISSN 1939-2222. PMID 22924882. Archived from the original (PDF) on 26 January 2021. Retrieved 25 February 2020. (in English)
Tesauro, Gerald (March 1995). Temporal Difference Learning and TD-Gammon. Communications of the ACM. 38 (3): 58–68. doi:10.1145/203330.203343. Archived from the original on 9 February 2010. Retrieved 8 February 2010. (in English)
Mnih, Volodymyr; Kavukcuoglu, Koray; Silver, David; Rusu, Andrei A.; Veness, Joel; Bellemare, Marc G.; Graves, Alex; Riedmiller, Martin; Fidjeland, Andreas K. (Feb 2015). Human-level control through deep reinforcement learning. Nature. 518 (7540): 529–533. doi:10.1038/nature14236. ISSN 0028-0836. PMID 25719670. (in English)

huji.ac.il

ratio.huji.ac.il

Shteingart, Hanan; Neiman, Tal; Loewenstein, Yonatan (May 2013). The role of first impression in operant learning (PDF). Journal of Experimental Psychology: General. 142 (2): 476–488. doi:10.1037/a0029550. ISSN 1939-2222. PMID 22924882. Archived from the original (PDF) on 26 January 2021. Retrieved 25 February 2020. (in English)

incompleteideas.net

Sutton, Richard; Barto, Andrew (1998). Reinforcement Learning: An Introduction. MIT Press. Archived from the original on 20 February 2020. Retrieved 4 March 2020. (in English)

leemon.com

Baird, Leemon (1995). Residual algorithms: Reinforcement learning with function approximation (PDF). ICML: 30–37. (in English)

microsoft.com

Strehl, Alexander L.; Li, Lihong; Wiewiora, Eric; Langford, John; Littman, Michael L. (2006). PAC model-free reinforcement learning (PDF). Proc. 22nd ICML: 881–888. Archived from the original (PDF) on 14 April 2021. Retrieved 4 March 2020. (in English)

nih.gov

pubmed.ncbi.nlm.nih.gov

Shteingart, Hanan; Neiman, Tal; Loewenstein, Yonatan (May 2013). The role of first impression in operant learning (PDF). Journal of Experimental Psychology: General. 142 (2): 476–488. doi:10.1037/a0029550. ISSN 1939-2222. PMID 22924882. Archived from the original (PDF) on 26 January 2021. Retrieved 25 February 2020. (in English)
Mnih, Volodymyr; Kavukcuoglu, Koray; Silver, David; Rusu, Andrei A.; Veness, Joel; Bellemare, Marc G.; Graves, Alex; Riedmiller, Martin; Fidjeland, Andreas K. (Feb 2015). Human-level control through deep reinforcement learning. Nature. 518 (7540): 529–533. doi:10.1038/nature14236. ISSN 0028-0836. PMID 25719670. (in English)

nips.cc

papers.nips.cc

van Hasselt, Hado (2011). Double Q-learning. Advances in Neural Information Processing Systems. 23: 2613–2622. Archived from the original (PDF) on 26 March 2020. Retrieved 4 March 2020. (in English)

rhul.ac.uk

cs.rhul.ac.uk

Watkins, C.J.C.H. (1989), Learning from Delayed Rewards (PDF) (Ph.D. thesis), Cambridge University, archived from the original (PDF) on 9 September 2016, retrieved 4 March 2020 (in English)

storage.googleapis.com

patentimages.storage.googleapis.com

Methods and Apparatus for Reinforcement Learning, US Patent #20150100530A1 (PDF). US Patent Office. 9 April 2015. Archived from the original (PDF) on 29 July 2018. Retrieved 28 July 2018. (in English)

ualberta.ca

webdocs.cs.ualberta.ca

Sutton, Richard S.; Barto, Andrew G. 2.7 Optimistic Initial Values. Reinforcement Learning: An Introduction. Archived from the original on 8 September 2013. Retrieved 18 July 2013. [Archived 2013-09-08 at the Wayback Machine.] (in English)
Maei, Hamid; Szepesvári, Csaba; Bhatnagar, Shalabh; Sutton, Richard (2010). Toward off-policy learning control with function approximation in Proceedings of the 27th International Conference on Machine Learning (PDF). pp. 719–726. Archived from the original (PDF) on 8 September 2012. Retrieved 25 January 2016. [Archived 2012-09-08 at the Wayback Machine.] (in English)

ut.ee

neuro.cs.ut.ee

Matiisen, Tambet (19 December 2015). Demystifying Deep Reinforcement Learning. neuro.cs.ut.ee (in American English). Computational Neuroscience Lab. Archived from the original on 7 April 2018. Retrieved 6 April 2018. (in English)

utl.pt

users.isr.ist.utl.pt

Melo, Francisco S. Convergence of Q-learning: a simple proof. Archived from the source on 18 November 2017. Retrieved 23 February 2020. (in English)

web.archive.org

Melo, Francisco S. Convergence of Q-learning: a simple proof. Archived from the source on 18 November 2017. Retrieved 23 February 2020. (in English)
Matiisen, Tambet (19 December 2015). Demystifying Deep Reinforcement Learning. neuro.cs.ut.ee (in American English). Computational Neuroscience Lab. Archived from the original on 7 April 2018. Retrieved 6 April 2018. (in English)
Sutton, Richard; Barto, Andrew (1998). Reinforcement Learning: An Introduction. MIT Press. Archived from the original on 20 February 2020. Retrieved 4 March 2020. (in English)
Sutton, Richard S.; Barto, Andrew G. 2.7 Optimistic Initial Values. Reinforcement Learning: An Introduction. Archived from the original on 8 September 2013. Retrieved 18 July 2013. [Archived 2013-09-08 at the Wayback Machine.] (in English)
Shteingart, Hanan; Neiman, Tal; Loewenstein, Yonatan (May 2013). The role of first impression in operant learning (PDF). Journal of Experimental Psychology: General. 142 (2): 476–488. doi:10.1037/a0029550. ISSN 1939-2222. PMID 22924882. Archived from the original (PDF) on 26 January 2021. Retrieved 25 February 2020. (in English)
Tesauro, Gerald (March 1995). Temporal Difference Learning and TD-Gammon. Communications of the ACM. 38 (3): 58–68. doi:10.1145/203330.203343. Archived from the original on 9 February 2010. Retrieved 8 February 2010. (in English)
Watkins, C.J.C.H. (1989), Learning from Delayed Rewards (PDF) (Ph.D. thesis), Cambridge University, archived from the original (PDF) on 9 September 2016, retrieved 4 March 2020 (in English)
Methods and Apparatus for Reinforcement Learning, US Patent #20150100530A1 (PDF). US Patent Office. 9 April 2015. Archived from the original (PDF) on 29 July 2018. Retrieved 28 July 2018. (in English)
van Hasselt, Hado (2011). Double Q-learning. Advances in Neural Information Processing Systems. 23: 2613–2622. Archived from the original (PDF) on 26 March 2020. Retrieved 4 March 2020. (in English)
van Hasselt, Hado; Guez, Arthur; Silver, David (2015). Deep reinforcement learning with double Q-learning. AAAI Conference on Artificial Intelligence: 2094–2100. Archived from the original (PDF) on 6 February 2020. Retrieved 4 March 2020. (in English)
Strehl, Alexander L.; Li, Lihong; Wiewiora, Eric; Langford, John; Littman, Michael L. (2006). PAC model-free reinforcement learning (PDF). Proc. 22nd ICML: 881–888. Archived from the original (PDF) on 14 April 2021. Retrieved 4 March 2020. (in English)
Maei, Hamid; Szepesvári, Csaba; Bhatnagar, Shalabh; Sutton, Richard (2010). Toward off-policy learning control with function approximation in Proceedings of the 27th International Conference on Machine Learning (PDF). pp. 719–726. Archived from the original (PDF) on 8 September 2012. Retrieved 25 January 2016. [Archived 2012-09-08 at the Wayback Machine.] (in English)

worldcat.org

search.worldcat.org

Shteingart, Hanan; Neiman, Tal; Loewenstein, Yonatan (May 2013). The role of first impression in operant learning (PDF). Journal of Experimental Psychology: General. 142 (2): 476–488. doi:10.1037/a0029550. ISSN 1939-2222. PMID 22924882. Archived from the original (PDF) on 26 January 2021. Retrieved 25 February 2020. (in English)
Mnih, Volodymyr; Kavukcuoglu, Koray; Silver, David; Rusu, Andrei A.; Veness, Joel; Bellemare, Marc G.; Graves, Alex; Riedmiller, Martin; Fidjeland, Andreas K. (Feb 2015). Human-level control through deep reinforcement learning. Nature. 518 (7540): 529–533. doi:10.1038/nature14236. ISSN 0028-0836. PMID 25719670. (in English)