In Keras, the input layer is really just a tensor, and it must have the same shape as your training data (per sample, without the batch dimension).
So the input shape is the one thing you have to define yourself, because only you know it: it depends on your training data. The model works out every other shape automatically.
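As a rough illustration of that, here is a minimal sketch. It assumes Keras is available through `tensorflow` (with standalone Keras you would use `import keras` instead), and the data, `x_train`, and the layer sizes are made up for the example. Only the input shape is supplied; the other shapes come from Keras.

```python
import numpy as np
from tensorflow import keras  # assumption: Keras is installed via TensorFlow

# Hypothetical training data: 100 samples, each with 8 features.
x_train = np.random.rand(100, 8).astype("float32")

model = keras.Sequential([
    # The only shape we specify: one sample of the training data, i.e. (8,).
    keras.Input(shape=x_train.shape[1:]),
    keras.layers.Dense(4, activation="relu"),
    keras.layers.Dense(1),
])

# The batch dimension appears as None; everything after the input was inferred.
print(model.input_shape)   # (None, 8)
print(model.output_shape)  # (None, 1)
```

Note that the shape passed in is `x_train.shape[1:]`, so the number of samples (100) is deliberately left out.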
If each sample is 1-D, you can pass input_dim as a scalar number instead, and there's no need for a tuple. Keep in mind, though, that when people talk about tensors, "dim" usually means the number of dimensions (axes) the tensor has, not the size of one of them.
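To make the two meanings concrete, here is a small sketch under the same assumption that Keras is imported from `tensorflow`. The layer sizes and the array are made up for the example, and recent Keras releases may print a warning recommending an explicit `keras.Input` over these keyword arguments.

```python
import numpy as np
from tensorflow import keras  # assumption: Keras is installed via TensorFlow

# Scalar form: each sample is a flat vector of 8 values.
model_a = keras.Sequential([keras.layers.Dense(4, input_dim=8)])

# Tuple form describing the same input.
model_b = keras.Sequential([keras.layers.Dense(4, input_shape=(8,))])

# Both models end up with the same input shape.
same_shape = model_a.input_shape == model_b.input_shape
print(same_shape)

# "dim" in the tensor sense: the number of axes, not the size of an axis.
x = np.zeros((25, 10))  # made-up array with shape (25, 10)
print(x.ndim)           # number of dimensions
print(x.shape)          # size along each dimension
```

So `input_dim=8` is about the size of a single axis, while an array of shape (25, 10) has 2 dimensions in the tensor sense.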