The Jacobian is the matrix of all first-order partial derivatives of a vector-valued function. In the neural network case, it is an N-by-W matrix, where N is the number of entries in our training set and W is the total number of parameters (weights + biases) of our network. It can be built by taking the partial derivative of each output with respect to each weight, and has the form:

J = [ ∂F(x_i, w) / ∂w_j ],   i = 1, …, N,   j = 1, …, W
where F(x_i, w) is the network function evaluated for the i-th input vector of the training set using the weight vector w, and w_j is the j-th element of that weight vector.
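To make the definition concrete, here is a minimal sketch that approximates this N-by-W matrix by central finite differences for an arbitrary network function; the helper name `numerical_jacobian` and the toy single-neuron model are illustrative assumptions, not code from the article:

```python
import numpy as np

def numerical_jacobian(net_fn, w, X, eps=1e-6):
    """Approximate J[i, j] = dF(x_i, w) / dw_j by central finite differences.
    Illustrative only: one forward-pass pair per parameter, so it is slow."""
    N, W = X.shape[0], w.shape[0]
    J = np.zeros((N, W))
    for j in range(W):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[j] += eps
        w_minus[j] -= eps
        # Each column holds the sensitivity of every training output to one weight.
        J[:, j] = (net_fn(X, w_plus) - net_fn(X, w_minus)) / (2 * eps)
    return J

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 3))                 # 5 training inputs, 3 weights
    w = rng.normal(size=3)
    J = numerical_jacobian(lambda X, w: X @ w, w, X)   # toy linear "network"
    print(J.shape)                              # (5, 3): N-by-W
```

For the toy linear neuron y = X @ w, the computed Jacobian simply reproduces X, which is a quick sanity check of the definition above.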
For more information on computing the Jacobian matrix of a neural network in Python, refer to the following link: https://medium.com/unit8-machine-learning-publication/computing-the-jacobian-matrix-of-a-neural-network-in-python-4f162e5db180
In classical Levenberg-Marquardt implementations, the Jacobian is approximated using finite differences. For neural networks, however, it can be computed very efficiently with the chain rule of calculus and the first derivatives of the activation functions.
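As a sketch of the chain-rule approach, the snippet below computes the exact Jacobian of a small one-hidden-layer tanh network with a single linear output. The architecture, parameter ordering, and helper names (`forward`, `analytic_jacobian`) are assumptions made for illustration, not the article's own code:

```python
import numpy as np

def forward(X, W1, b1, w2, b2):
    """One-hidden-layer network with tanh units and a single linear output."""
    Z = X @ W1 + b1          # (N, H) hidden pre-activations
    H = np.tanh(Z)           # (N, H) hidden activations
    y = H @ w2 + b2          # (N,)  one scalar output per training sample
    return y, H

def analytic_jacobian(X, W1, b1, w2, b2):
    """Exact N-by-W Jacobian via the chain rule, one row per training sample.
    Parameter order in each row: W1 (flattened row-major), b1, w2, b2."""
    N, D = X.shape
    H_units = W1.shape[1]
    _, H = forward(X, W1, b1, w2, b2)
    # Sensitivity of the output to each hidden pre-activation:
    # dy/dz_m = w2[m] * tanh'(z_m) = w2[m] * (1 - tanh(z_m)^2)
    delta = (1.0 - H**2) * w2                               # (N, H)
    dW1 = np.einsum("nd,nh->ndh", X, delta).reshape(N, D * H_units)
    db1 = delta                                             # (N, H)
    dw2 = H                                                 # (N, H)
    db2 = np.ones((N, 1))
    return np.hstack([dW1, db1, dw2, db2])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(4, 2))
    W1, b1 = rng.normal(size=(2, 3)), rng.normal(size=3)
    w2, b2 = rng.normal(size=3), 0.1
    J = analytic_jacobian(X, W1, b1, w2, b2)
    print(J.shape)   # (4, 13): 4 samples by 2*3 + 3 + 3 + 1 parameters
```

Because each Jacobian row reuses the quantities already produced by a single forward pass, this chain-rule computation avoids the per-parameter forward passes that a finite-difference approximation would require.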