The rationale is to allow more than one top-level class per .java file.
Many classes—such as event listeners—are of local use only and the earliest versions of Java did not support nested classes. Without this relaxation of the "filename = class name" rule, each and every such class would have required its own file, with the unavoidable result of endless proliferation of small .java files and the scattering of tightly coupled code.
As soon as Java introduced nested classes, the importance of this rule waned significantly. Today you can go through many hundreds of Java files, never chancing upon one which takes advantage of it.