Neural Networks: Tricks of the Trade
November 2012
Publisher: Springer Publishing Company, Incorporated
ISBN: 978-3-642-35288-1
Published: 06 November 2012
Pages: 781
Abstract

The last twenty years have been marked by an increase in available data and computing power. In parallel to this trend, the focus of neural network research and the practice of training neural networks have undergone a number of important changes, for example, the use of deep learning machines. The second edition of the book augments the first edition with more tricks, which have resulted from 14 years of theory and experimentation by some of the world's most prominent neural network researchers. These tricks can make a substantial difference (in terms of speed, ease of implementation, and accuracy) when it comes to putting algorithms to work on real problems.

Cited By

  1. Banerjee K, Singh A, Akhtar N and Vats I (2024). Machine-Learning-Based Accessibility System, SN Computer Science, 5:3, Online publication date: 28-Feb-2024.
  2. Malinovsky G, Mishchenko K and Richtárik P Server-Side Stepsizes and Sampling Without Replacement Provably Help in Federated Optimization Proceedings of the 4th International Workshop on Distributed Machine Learning, (85-104)
  3. S.K P, Kesanapalli S and Simmhan Y (2022). Characterizing the Performance of Accelerated Jetson Edge Devices for Training Deep Learning Models, Proceedings of the ACM on Measurement and Analysis of Computing Systems, 6:3, (1-26), Online publication date: 1-Dec-2022.
  4. Karmaker (“Santu”) S, Hassan M, Smith M, Xu L, Zhai C and Veeramachaneni K (2021). AutoML to Date and Beyond: Challenges and Opportunities, ACM Computing Surveys, 54:8, (1-36), Online publication date: 30-Nov-2022.
  5. Petrolo R, Shaikhanov Z, Lin Y and Knightly E (2021). ASTRO, ACM Transactions on Internet of Things, 2:4, (1-22), Online publication date: 30-Nov-2021.
  6. Teng Y, Chen H, Yang D, Pignolet Y, Li T and Chen L On Influencing the Influential Proceedings of the 30th ACM International Conference on Information & Knowledge Management, (1804-1813)
  7. Li C, Peng X, Peng H, Wu J, Wang L, Yu P, Li J and Sun L Graph-based Semi-Supervised Learning by Strengthening Local Label Consistency Proceedings of the 30th ACM International Conference on Information & Knowledge Management, (3201-3205)
  8. Yeh C, Zhuang Z, Wang J, Zheng Y, Ebrahimi J, Mercer R, Wang L and Zhang W Online Multi-horizon Transaction Metric Estimation with Multi-modal Learning in Payment Networks Proceedings of the 30th ACM International Conference on Information & Knowledge Management, (4331-4340)
  9. Lakhmiri D, Digabel S and Tribes C (2021). HyperNOMAD, ACM Transactions on Mathematical Software, 47:3, (1-27), Online publication date: 30-Sep-2021.
  10. Wang Z, Long C, Cong G and Zhang Q Error-Bounded Online Trajectory Simplification with Multi-Agent Reinforcement Learning Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, (1758-1768)
  11. Qian B, Su J, Wen Z, Jha D, Li Y, Guan Y, Puthal D, James P, Yang R, Zomaya A, Rana O, Wang L, Koutny M and Ranjan R (2020). Orchestrating the Development Lifecycle of Machine Learning-based IoT Applications, ACM Computing Surveys, 53:4, (1-47), Online publication date: 31-Jul-2021.
  12. Guo B, Ding Y, Yao L, Liang Y and Yu Z (2020). The Future of False Information Detection on Social Media, ACM Computing Surveys, 53:4, (1-36), Online publication date: 31-Jul-2021.
  13. Fu S, Liu W, Guan W, Zhou Y, Tao D and Xu C (2021). Dynamic Graph Learning Convolutional Networks for Semi-supervised Classification, ACM Transactions on Multimedia Computing, Communications, and Applications, 17:1s, (1-13), Online publication date: 31-Jan-2021.
  14. Zhang L and Lu H A Feature-Importance-Aware and Robust Aggregator for GCN Proceedings of the 29th ACM International Conference on Information & Knowledge Management, (1813-1822)
  15. Zhang W, Miao X, Shao Y, Jiang J, Chen L, Ruas O and Cui B Reliable Data Distillation on Graph Convolutional Network Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, (1399-1414)
  16. Wampfler R, Klingler S, Solenthaler B, Schinazi V and Gross M Affective State Prediction Based on Semi-Supervised Learning from Smartphone Touch Data Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, (1-13)
  17. Ghosh S and Ghosh S Exploring the Ideal Depth of Neural Network when Predicting Question Deletion on Community Question Answering Proceedings of the 11th Annual Meeting of the Forum for Information Retrieval Evaluation, (52-55)
  18. Zhang M, Lucas J, Hinton G and Ba J Lookahead optimizer Proceedings of the 33rd International Conference on Neural Information Processing Systems, (9597-9608)
  19. Lee S and Nirjon S Neuro.ZERO Proceedings of the 17th Conference on Embedded Networked Sensor Systems, (138-152)
  20. Wang B and Gong N Attacking Graph-based Classification via Manipulating the Graph Structure Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, (2023-2040)
  21. Rizk H and Youssef M MonoDCell Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, (109-118)
  22. Chauhan V, Dahiya K and Sharma A (2019). Problem formulations and solvers in linear SVM, Artificial Intelligence Review, 52:2, (803-855), Online publication date: 1-Aug-2019.
  23. Gao H, Pei J and Huang H Conditional Random Field Enhanced Graph Convolutional Neural Networks Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (276-284)
  24. Li J, Guo R, Liu C and Liu H Adaptive Unsupervised Feature Selection on Attributed Networks Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (92-100)
  25. Katrychuk D, Griffith H and Komogortsev O Power-efficient and shift-robust eye-tracking sensor for portable VR headsets Proceedings of the 11th ACM Symposium on Eye Tracking Research & Applications, (1-8)
  26. Ding P, Zhang Y, Jia P and Chang X (2019). A Comparison, Neural Processing Letters, 49:3, (1369-1379), Online publication date: 1-Jun-2019.
  27. Liang S Unsupervised Semantic Generative Adversarial Networks for Expert Retrieval The World Wide Web Conference, (1039-1050)
  28. Ibrahim R and Gleich D Nonlinear Diffusion for Community Detection and Semi-Supervised Learning The World Wide Web Conference, (739-750)
  29. Fernández C, Salinas L and Torres C (2019). A meta extreme learning machine method for forecasting financial time series, Applied Intelligence, 49:2, (532-554), Online publication date: 1-Feb-2019.
  30. Vellanki P, Rana S, Gupta S, Leal D, Sutti A, Height M and Venkatesh S Bayesian functional optimisation with shape prior Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, (1617-1624)
  31. Ding M, Tang J and Zhang J Semi-supervised Learning on Graphs with Generative Adversarial Nets Proceedings of the 27th ACM International Conference on Information and Knowledge Management, (913-922)
  32. Van Gysel C, de Rijke M and Kanoulas E Mix 'n Match Proceedings of the 27th ACM International Conference on Information and Knowledge Management, (1373-1382)
  33. Jiang W, Miao C, Ma F, Yao S, Wang Y, Yuan Y, Xue H, Song C, Ma X, Koutsonikolas D, Xu W and Su L Towards Environment Independent Device Free Human Activity Recognition Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, (289-304)
  34. Shams S, Platania R, Kim J, Zhang J, Lee K, Yang S and Park S A Distributed Semi-Supervised Platform for DNase-Seq Data Analytics using Deep Generative Convolutional Networks Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, (244-253)
  35. Hu Y, Yu Y and Zhou Z Experienced optimization with reusable directional model for hyper-parameter search Proceedings of the 27th International Joint Conference on Artificial Intelligence, (2276-2282)
  36. Stojanovic V, Trapp M, Richter R and Döllner J A service-oriented approach for classifying 3D points clouds by example of office furniture classification Proceedings of the 23rd International ACM Conference on 3D Web Technology, (1-9)
  37. Lin H, Wang H, Du D, Wu H, Chang B and Chen E Patent Quality Valuation with Deep Learning Models Database Systems for Advanced Applications, (474-490)
  38. Ren G, Ni X, Malik M and Ke Q Conversational Query Understanding Using Sequence to Sequence Modeling Proceedings of the 2018 World Wide Web Conference, (1715-1724)
  39. Park J, Cho H, Jung W and Lee J (2018). Transparent GPU memory management for DNNs, ACM SIGPLAN Notices, 53:1, (411-412), Online publication date: 23-Mar-2018.
  40. Park J, Cho H, Jung W and Lee J Transparent GPU memory management for DNNs Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, (411-412)
  41. Abel J and Fingscheidt T (2018). Artificial Speech Bandwidth Extension Using Deep Neural Networks for Wideband Spectral Envelope Estimation, IEEE/ACM Transactions on Audio, Speech and Language Processing, 26:1, (71-83), Online publication date: 1-Jan-2018.
  42. Al-Sallab A, Baly R, Hajj H, Shaban K, El-Hajj W and Badaro G (2017). AROMA, ACM Transactions on Asian and Low-Resource Language Information Processing, 16:4, (1-20), Online publication date: 31-Dec-2018.
  43. Hoffer E, Hubara I and Soudry D Train longer, generalize better Proceedings of the 31st International Conference on Neural Information Processing Systems, (1729-1739)
  44. Sheng K, Dong W, Li W, Razik J, Huang F and Hu B (2017). Centroid-aware local discriminative metric learning in speaker verification, Pattern Recognition, 72:C, (176-185), Online publication date: 1-Dec-2017.
  45. Jiang X, de Souza E, Pesaranghader A, Hu B, Silver D and Matwin S TrajectoryNet Proceedings of the 27th Annual International Conference on Computer Science and Software Engineering, (192-200)
  46. Butnaru A and Ionescu R (2017). From Image to Text Classification, Procedia Computer Science, 112:C, (1783-1792), Online publication date: 1-Sep-2017.
  47. Xing J, Li K, Hu W, Yuan C and Ling H (2017). Diagnosing deep learning models for high accuracy age estimation from a single image, Pattern Recognition, 66:C, (106-116), Online publication date: 1-Jun-2017.
  48. Chen Z, Wu H, Gao B, Yao P, Li X and Qian H Neuromorphic Computing based on Resistive RAM Proceedings of the on Great Lakes Symposium on VLSI 2017, (311-315)
  49. Cha Y, Choi W and Büyüköztürk O (2017). Deep Learning-Based Crack Damage Detection Using Convolutional Neural Networks, Computer-Aided Civil and Infrastructure Engineering, 32:5, (361-378), Online publication date: 1-May-2017.
  50. Duan Y, Liu F, Jiao L, Zhao P and Zhang L (2017). SAR Image segmentation based on convolutional-wavelet neural network and markov random field, Pattern Recognition, 64:C, (255-267), Online publication date: 1-Apr-2017.
  51. Layouni M, Hamdi M and Tahar S (2017). Detection and sizing of metal-loss defects in oil and gas pipelines using pattern-adapted wavelets and machine learning, Applied Soft Computing, 52:C, (247-261), Online publication date: 1-Mar-2017.
  52. Xie D, Zhang L and Bai L (2017). Deep Learning in Visual Computing and Signal Processing, Applied Computational Intelligence and Soft Computing, 2017, (1), Online publication date: 1-Feb-2017.
  53. Gao J, Yang J, Wang G and Li M (2016). A novel feature extraction method for scene recognition based on Centered Convolutional Restricted Boltzmann Machines, Neurocomputing, 214:C, (708-717), Online publication date: 19-Nov-2016.
  54. Tang J, Shu X, Li Z, Qi G and Wang J (2016). Generalized Deep Transfer Networks for Knowledge Propagation in Heterogeneous Domains, ACM Transactions on Multimedia Computing, Communications, and Applications, 12:4s, (1-22), Online publication date: 18-Nov-2016.
  55. Ionescu R (2016). Measuring the Local Non-alignment Between Objects, Procedia Computer Science, 96:C, (838-847), Online publication date: 1-Oct-2016.
  56. Yang W, Jin L, Tao D, Xie Z and Feng Z (2016). DropSample, Pattern Recognition, 58:C, (190-203), Online publication date: 1-Oct-2016.
  57. Welchowski T and Schmid M (2016). A framework for parameter estimation and model selection in kernel deep stacking networks, Artificial Intelligence in Medicine, 70:C, (31-40), Online publication date: 1-Jun-2016.
  58. Van Gysel C, de Rijke M and Worring M Unsupervised, Efficient and Semantic Expertise Retrieval Proceedings of the 25th International Conference on World Wide Web, (1069-1079)
  59. Gligorijevic D, Stojanovic J and Obradovic Z Uncertainty propagation in long-term structured regression on evolving networks Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, (1603-1609)
  60. Goswami P, Amini M and Gaussier E Language-independent Query Representation for IR Model Parameter Estimation on Unlabeled Collections Proceedings of the 2015 International Conference on The Theory of Information Retrieval, (121-130)
  61. Domhan T, Springenberg J and Hutter F Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves Proceedings of the 24th International Conference on Artificial Intelligence, (3460-3468)
  62. Schmidhuber J (2015). Deep learning in neural networks, Neural Networks, 61:C, (85-117), Online publication date: 1-Jan-2015.
  63. Shen Y, He X, Gao J, Deng L and Mesnil G A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, (101-110)
  64. Huang P, He X, Gao J, Deng L, Acero A and Heck L Learning deep structured semantic models for web search using clickthrough data Proceedings of the 22nd ACM international conference on Information & Knowledge Management, (2333-2338)
  65. Unger M, Li P, Sen S and Tuzhilin A Don’t Need All Eggs in One Basket: Reconstructing Composite Embeddings of Customers from Individual-Domain Embeddings, ACM Transactions on Management Information Systems, 0:0
  66. Qiao Z, Wang P, Wang P, Ning Z, Fu Y, Du Y, Zhou Y, Huang J, Hua X and Xiong H A Dual-Channel Semi-Supervised Learning Framework on Graphs via Knowledge Transfer and Meta-Learning, ACM Transactions on the Web, 0:0