Similar Items: Dense vs Sparse Pretraining at Tiny Scale: Active-Parameter vs Total-Parameter Matching