Abstract: Consider end-to-end training of a multi-modal vs. a uni-modal network on a task with multiple input modalities: the multi-modal network receives more information, so it should match or ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results