The last twenty years have been marked by an increase in available data and computing power. In parallel with this trend, the focus of neural network research and the practice of training neural networks have undergone a number of important changes, for example, the use of deep learning machines. The second edition of the book augments the first edition with more tricks, which have resulted from 14 years of theory and experimentation by some of the world's most prominent neural network researchers. These tricks can make a substantial difference (in terms of speed, ease of implementation, and accuracy) when it comes to putting algorithms to work on real problems.