Dive into Deep Learning Compiler
Table Of Contents
  • 1. Getting Started
    • 1.1. Installation
    • 1.2. Vector Add
    • 1.3. Neural Network Inference
    • 1.4. Running on a Remote Machine
  • 2. Expressions for Operators
    • 2.1. Data Types
    • 2.2. Shapes
    • 2.3. Index and Shape Expressions
    • 2.4. Reduction Operations
    • 2.5. Conditional Expression: if-then-else
    • 2.6. Truth Value Testing: all and any
  • 3. Common Operators
    • 3.1. Broadcast Add
    • 3.2. Matrix Multiplication
    • 3.3. Convolution
    • 3.4. Depthwise Convolution
    • 3.5. Pooling
    • 3.6. Batch Normalization
  • Operator Optimizations on CPUs
    • 1. CPU Architecture
    • 2. Function Call Overhead
    • 3. Vector Add
    • 4. Broadcast Add
    • 5. Matrix Multiplication
    • 6. Improve Cache Efficiency by Blocking
    • 7. Convolution
    • 8. Packed Convolution
    • 9. Depthwise Convolution
    • 10. Pooling
    • 11. Batch Normalization
  • Operator Optimizations on GPUs
    • 1. GPU Architecture
    • 2. Vector Add
    • 3. Broadcast Add
    • 4. Matrix Multiplication
    • 5. Convolution
    • 6. Depthwise Convolution
    • 7. Pooling
    • 8. Batch Norm
  • 4. Neural Networks
  • 5. Deployment
  • References

3. Common Operators

In the last chapter, we went over how to implement basic expressions and operators using TVM. This chapter further describes how to implement the typical operators we encounter in deep learning models.
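As a preview of the first operator covered below (Section 3.1), the following is a minimal sketch of broadcast add semantics in plain NumPy; the chapter itself defines the operator with TVM compute expressions, so this is only an illustration of the expected behavior, not the TVM implementation.

```python
import numpy as np

def broadcast_add(a, b):
    """Add two arrays, broadcasting dimensions of size 1.

    NumPy applies the same broadcasting rule that the TVM
    operator in Section 3.1 implements explicitly: a size-1
    axis is repeated to match the other operand's size.
    """
    return a + b

# A (3, 1) array plus a (1, 4) array broadcasts to (3, 4).
a = np.ones((3, 1))
b = np.arange(4).reshape((1, 4))
c = broadcast_add(a, b)
print(c.shape)  # (3, 4)
```

Each output element is `c[i, j] = a[i, 0] + b[0, j]`, which is exactly the indexing pattern the TVM compute definition must express.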

  • 3.1. Broadcast Add
    • 3.1.1. Summary
  • 3.2. Matrix Multiplication
    • 3.2.1. Summary
  • 3.3. Convolution
    • 3.3.1. Padding
    • 3.3.2. Convolution
    • 3.3.3. Summary
  • 3.4. Depthwise Convolution
    • 3.4.1. Compute definition
    • 3.4.2. Depthwise Convolution in General
    • 3.4.3. Comparing to Baseline
    • 3.4.4. Summary
  • 3.5. Pooling
    • 3.5.1. Compute definition
    • 3.5.2. MXNet Baseline
    • 3.5.3. Summary
  • 3.6. Batch Normalization
    • 3.6.1. Compute definition
    • 3.6.2. MXNet Baseline
    • 3.6.3. Summary
Previous
2.6. Truth Value Testing: all and any
Next
3.1. Broadcast Add