Android6.0系统启动流程分析三:SystemServer进程

时间:2022-07-29 16:08:22

在上一篇博客 Android6.0系统启动流程分析二:zygote进程一文中,我们队Zygote进程的有了一定的了解。我们知道Zygote进程会启动SystemServer进程,但我们并没有在上篇文章中分析SystemServer进程的相关内容。这篇博客,我们将目标汇聚在SystemServer进程上,看看这个进程都做了什么事情。
SystemServer启动流程参考如下时序图:
Android6.0系统启动流程分析三:SystemServer进程
下面将分阶段对SysteServer进行分析。

SystemServer进程分析

在ZygoteInit的main方法中,启动了SystemServer进程:

            ...
if (startSystemServer) {
startSystemServer(abiList, socketName);
}
...

启动使用的是startSystemServer方法,这个方法也定义在ZygoteInit类中:

   /**
* Prepare the arguments and fork for the system server process.
*/

private static boolean startSystemServer(String abiList, String socketName)
throws MethodAndArgsCaller, RuntimeException {
long capabilities = posixCapabilitiesAsBits(
OsConstants.CAP_BLOCK_SUSPEND,
OsConstants.CAP_KILL,
OsConstants.CAP_NET_ADMIN,
OsConstants.CAP_NET_BIND_SERVICE,
OsConstants.CAP_NET_BROADCAST,
OsConstants.CAP_NET_RAW,
OsConstants.CAP_SYS_MODULE,
OsConstants.CAP_SYS_NICE,
OsConstants.CAP_SYS_RESOURCE,
OsConstants.CAP_SYS_TIME,
OsConstants.CAP_SYS_TTY_CONFIG
);
/* Hardcoded command line to start the system server */
String args[] = {
"--setuid=1000",
"--setgid=1000",
"--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1032,3001,3002,3003,3006,3007",
"--capabilities=" + capabilities + "," + capabilities,
"--nice-name=system_server",
"--runtime-args",
"com.android.server.SystemServer",
};
ZygoteConnection.Arguments parsedArgs = null;

int pid;

try {
parsedArgs = new ZygoteConnection.Arguments(args);
ZygoteConnection.applyDebuggerSystemProperty(parsedArgs);
ZygoteConnection.applyInvokeWithSystemProperty(parsedArgs);

/* Request to fork the system server process */
pid = Zygote.forkSystemServer(
parsedArgs.uid, parsedArgs.gid,
parsedArgs.gids,
parsedArgs.debugFlags,
null,
parsedArgs.permittedCapabilities,
parsedArgs.effectiveCapabilities);
} catch (IllegalArgumentException ex) {
throw new RuntimeException(ex);
}

/* For child process */
if (pid == 0) {
if (hasSecondZygote(abiList)) {
waitForSecondaryZygote(socketName);
}

handleSystemServerProcess(parsedArgs);
}

return true;
}

这个方法做了两件事情:
1. 创建SystemServer进程:Zygote.forkSystemServer
2. 做后续的初始化工作:handleSystemServerProcess

创建SystemServer进程

SystemServer进程的创建入口是forkSystemServer方法,这个方法定义在frameworks\base\core\java\com\android\internal\os\Zygote中:

    public static int forkSystemServer(int uid, int gid, int[] gids, int debugFlags,
int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
VM_HOOKS.preFork();
int pid = nativeForkSystemServer(
uid, gid, gids, debugFlags, rlimits, permittedCapabilities, effectiveCapabilities);
// Enable tracing as soon as we enter the system_server.
if (pid == 0) {
Trace.setTracingEnabled(true);
}
VM_HOOKS.postForkCommon();
return pid;
}

这个方法做的三件事情:
1. VM_HOOKS.preFork(),做准备工作。
2. nativeForkSystemServer,创建子进程
3. VM_HOOKS.postForkCommon();启动Zygote的4个Daemon线程,java堆整理,引用队列,以及析构线程。

和我们在上一篇博客 Android6.0系统启动流程分析二:zygote进程中分析的”Zygote创建子进程的过程”一节中分析到Zygote进程创建一个子进程(SystemServer除外的进程)也需要做三件事:
1. VM_HOOKS.preFork(),做准备工作。
2. nativeForkAndSpecialize,创建子进程
3. VM_HOOKS.postForkCommon();启动Zygote的4个Daemon线程,java堆整理,引用队列,以及析构线程。
对比可见这两个过程非常的相似,只有调用创建子进程的方法不一样,因此,这一步我们只分析nativeForkSystemServer方法,其他的请参考上一篇博客的”Zygote创建子进程的过程”。

nativeForkSystemServer方法定义在frameworks\base\core\jni\com_android_internal_os_Zygote.cpp 中:

static jint com_android_internal_os_Zygote_nativeForkSystemServer(
JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
jint debug_flags, jobjectArray rlimits, jlong permittedCapabilities,
jlong effectiveCapabilities) {
pid_t pid = ForkAndSpecializeCommon(env, uid, gid, gids,
debug_flags, rlimits,
permittedCapabilities, effectiveCapabilities,
MOUNT_EXTERNAL_DEFAULT, NULL, NULL, true, NULL,
NULL, NULL);
if (pid > 0) {
// The zygote process checks whether the child process has died or not.
ALOGI("System server process %d has been created", pid);
gSystemServerPid = pid;
// There is a slight window that the system server process has crashed
// but it went unnoticed because we haven't published its pid yet. So
// we recheck here just to make sure that all is well.
int status;
if (waitpid(pid, &status, WNOHANG) == pid) {
ALOGE("System server process %d has died. Restarting Zygote!", pid);
RuntimeAbort(env);
}
}
return pid;
}

对比下创建其他子进程时的过程,我么发现创建SystemServer进程使用的是nativeForkSystemServer方法,也就是上面这个方法,而创建普通进程使用的是nativeForkAndSpecialize方法,我们对比下这个方法:

static jint com_android_internal_os_Zygote_nativeForkAndSpecialize(
JNIEnv* env, jclass, jint uid, jint gid, jintArray gids,
jint debug_flags, jobjectArray rlimits,
jint mount_external, jstring se_info, jstring se_name,
jintArray fdsToClose, jstring instructionSet, jstring appDataDir) {
// Grant CAP_WAKE_ALARM to the Bluetooth process.
jlong capabilities = 0;
if (uid == AID_BLUETOOTH) {
capabilities |= (1LL << CAP_WAKE_ALARM);
}

return ForkAndSpecializeCommon(env, uid, gid, gids, debug_flags,
rlimits, capabilities, capabilities, mount_external, se_info,
se_name, false, fdsToClose, instructionSet, appDataDir);
}

可以看到创建SystemServer多了如下代码:

  if (pid > 0) {
// The zygote process checks whether the child process has died or not.
ALOGI("System server process %d has been created", pid);
gSystemServerPid = pid;
// There is a slight window that the system server process has crashed
// but it went unnoticed because we haven't published its pid yet. So
// we recheck here just to make sure that all is well.
int status;
if (waitpid(pid, &status, WNOHANG) == pid) {
ALOGE("System server process %d has died. Restarting Zygote!", pid);
RuntimeAbort(env);
}
}

这段代码是在父进程中执行的,如今成会检测子进程有没有挂掉,因为SystemServer非常重要,所以它的生死必须非常重视。
从中我们可以知道SystemServer和其他进程的创建过程并没有什么重大的区别,只不过由于SystemServer进程的重要性,Zygote进程会对他“多加关照”。
既然他们的创建过程没有多大的区别,那么问题是:普通的进程创建后会执行ActivityThread的main方法,那么SystemServer进程会执行什么方法呢?

SystemServer进程的后续初始化

后续初始化工作是从handleSystemServerProcess方法开始的。这个方法定义在frameworks\base\core\java\com\android\internal\os\ZygoteInit类中:

   private static void handleSystemServerProcess(
ZygoteConnection.Arguments parsedArgs)
throws ZygoteInit.MethodAndArgsCaller {

closeServerSocket();

// set umask to 0077 so new files and directories will default to owner-only permissions.
Os.umask(S_IRWXG | S_IRWXO);

if (parsedArgs.niceName != null) {
Process.setArgV0(parsedArgs.niceName);
}

final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
if (systemServerClasspath != null) {
performSystemServerDexOpt(systemServerClasspath);
}

if (parsedArgs.invokeWith != null) {
String[] args = parsedArgs.remainingArgs;
// If we have a non-null system server class path, we'll have to duplicate the
// existing arguments and append the classpath to it. ART will handle the classpath
// correctly when we exec a new process.
if (systemServerClasspath != null) {
String[] amendedArgs = new String[args.length + 2];
amendedArgs[0] = "-cp";
amendedArgs[1] = systemServerClasspath;
System.arraycopy(parsedArgs.remainingArgs, 0, amendedArgs, 2, parsedArgs.remainingArgs.length);
}

WrapperInit.execApplication(parsedArgs.invokeWith,
parsedArgs.niceName, parsedArgs.targetSdkVersion,
VMRuntime.getCurrentInstructionSet(), null, args);
} else {
ClassLoader cl = null;
if (systemServerClasspath != null) {
cl = new PathClassLoader(systemServerClasspath, ClassLoader.getSystemClassLoader());
Thread.currentThread().setContextClassLoader(cl);
}

/*
* Pass the remaining arguments to SystemServer.
*/

RuntimeInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
}

/* should never reach here */
}

这个方法做两件事情:
1.关掉从父进程继承下来的socket。
2调用RuntimeInit.zygoteInit做进一步初始化。
zygoteInit方法定义在frameworks\base\core\java\com\android\internal\os\RuntimeInit类中:

    public static final void zygoteInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
throws ZygoteInit.MethodAndArgsCaller {
if (DEBUG) Slog.d(TAG, "RuntimeInit: Starting application from zygote");

Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "RuntimeInit");
redirectLogStreams();

commonInit();
nativeZygoteInit();
applicationInit(targetSdkVersion, argv, classLoader);
}

这个方法在上一节分析Zygote进程创建普通子进程的时候已经分析过了,它的关键在applicationInit方法中。
applicationInit方法如下:

  private static void applicationInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
throws ZygoteInit.MethodAndArgsCaller {
// If the application calls System.exit(), terminate the process
// immediately without running any shutdown hooks. It is not possible to
// shutdown an Android application gracefully. Among other things, the
// Android runtime shutdown hooks close the Binder driver, which can cause
// leftover running threads to crash before the process actually exits.
nativeSetExitWithoutCleanup(true);

// We want to be fairly aggressive about heap utilization, to avoid
// holding on to a lot of memory that isn't needed.
VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);

final Arguments args;
try {
args = new Arguments(argv);
} catch (IllegalArgumentException ex) {
Slog.e(TAG, ex.getMessage());
// let the process exit
return;
}

// The end of of the RuntimeInit event (see #zygoteInit).
Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);

// Remaining arguments are passed to the start class's static main
invokeStaticMain(args.startClass, args.startArgs, classLoader);
}

这个方法会调用invokeStaticMain来调用SystemServer的main方法,这么说是因为args.startClass的值就是“com.android.server.SystemServer”。我们在调用startSystemServer启动SystemServer进程之初就给他指明了参数:

        String args[] = {
"--setuid=1000",
"--setgid=1000",
"--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1032,3001,3002,3003,3006,3007",
"--capabilities=" + capabilities + "," + capabilities,
"--nice-name=system_server",
"--runtime-args",
"com.android.server.SystemServer",
};

少许跟踪就会知道这里的args.startClass=“com.android.server.SystemServer”。
invokeStaticMain方法如下:

    private static void invokeStaticMain(String className, String[] argv, ClassLoader classLoader)
throws ZygoteInit.MethodAndArgsCaller {
Class<?> cl;

try {
cl = Class.forName(className, true, classLoader);
} catch (ClassNotFoundException ex) {
throw new RuntimeException(
"Missing class when invoking static main " + className,
ex);
}

Method m;
try {
m = cl.getMethod("main", new Class[] { String[].class });
} catch (NoSuchMethodException ex) {
throw new RuntimeException(
"Missing static main on " + className, ex);
} catch (SecurityException ex) {
throw new RuntimeException(
"Problem getting static main on " + className, ex);
}

int modifiers = m.getModifiers();
if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
throw new RuntimeException(
"Main method is not public and static on " + className);
}

/*
* This throw gets caught in ZygoteInit.main(), which responds
* by invoking the exception's run() method. This arrangement
* clears up all the stack frames that were required in setting
* up the process.
*/

throw new ZygoteInit.MethodAndArgsCaller(m, argv);
}

这个方法首先会根据类名找到Class对象,然后获得这个类的main方法,也就是com.android.server.SystemServer的main方法了。最后抛出一个异常,这个异常定义如下:

    public static class MethodAndArgsCaller extends Exception
implements Runnable {

/** method to call */
private final Method mMethod;

/** argument array */
private final String[] mArgs;

public MethodAndArgsCaller(Method method, String[] args) {
mMethod = method;
mArgs = args;
}

public void run() {
try {
mMethod.invoke(null, new Object[] { mArgs });
} catch (IllegalAccessException ex) {
throw new RuntimeException(ex);
} catch (InvocationTargetException ex) {
Throwable cause = ex.getCause();
if (cause instanceof RuntimeException) {
throw (RuntimeException) cause;
} else if (cause instanceof Error) {
throw (Error) cause;
}
throw new RuntimeException(ex);
}
}
}
}

这个异常是一个Runnable,这个异常抛出以后,会在ZygoteInit的main方法中捕获到:

catch (MethodAndArgsCaller caller) {
caller.run();
}

捕获的异常后会执行异常的run方法,这个方法已经贴过了,会在这个方法中通过mMethod.invoke(null, new Object[] { mArgs });来启动SystemServer的main方法。
这样,SystemServer进程就来到com.android.server.SystemServer的main方法了.

执行SystemServer的main方法

SystemServer类定义在frameworks\base\services\java\com\android\server目录下,其main方法如下:

    public static void main(String[] args) {
new SystemServer().run();
}

执行SystemServer的run方法,run方法如下:

   private void run() {
// If a device's clock is before 1970 (before 0), a lot of
// APIs crash dealing with negative numbers, notably
// java.io.File#setLastModified, so instead we fake it and
// hope that time from cell towers or NTP fixes it shortly.
if (System.currentTimeMillis() < EARLIEST_SUPPORTED_TIME) {
Slog.w(TAG, "System clock is before 1970; setting to 1970.");
SystemClock.setCurrentTimeMillis(EARLIEST_SUPPORTED_TIME);
}

// If the system has "persist.sys.language" and friends set, replace them with
// "persist.sys.locale". Note that the default locale at this point is calculated
// using the "-Duser.locale" command line flag. That flag is usually populated by
// AndroidRuntime using the same set of system properties, but only the system_server
// and system apps are allowed to set them.
//
// NOTE: Most changes made here will need an equivalent change to
// core/jni/AndroidRuntime.cpp
if (!SystemProperties.get("persist.sys.language").isEmpty()) {
final String languageTag = Locale.getDefault().toLanguageTag();

SystemProperties.set("persist.sys.locale", languageTag);
SystemProperties.set("persist.sys.language", "");
SystemProperties.set("persist.sys.country", "");
SystemProperties.set("persist.sys.localevar", "");
}

// Here we go!
Slog.i(TAG, "Entered the Android system server!");
EventLog.writeEvent(EventLogTags.BOOT_PROGRESS_SYSTEM_RUN, SystemClock.uptimeMillis());

// In case the runtime switched since last boot (such as when
// the old runtime was removed in an OTA), set the system
// property so that it is in sync. We can't do this in
// libnativehelper's JniInvocation::Init code where we already
// had to fallback to a different runtime because it is
// running as root and we need to be the system user to set
// the property. http://b/11463182
SystemProperties.set("persist.sys.dalvik.vm.lib.2", VMRuntime.getRuntime().vmLibrary());

// Enable the sampling profiler.
if (SamplingProfilerIntegration.isEnabled()) {
SamplingProfilerIntegration.start();
mProfilerSnapshotTimer = new Timer();
mProfilerSnapshotTimer.schedule(new TimerTask() {
@Override
public void run() {
SamplingProfilerIntegration.writeSnapshot("system_server", null);
}
}, SNAPSHOT_INTERVAL, SNAPSHOT_INTERVAL);
}

// Mmmmmm... more memory!
VMRuntime.getRuntime().clearGrowthLimit();

// The system server has to run all of the time, so it needs to be
// as efficient as possible with its memory usage.
VMRuntime.getRuntime().setTargetHeapUtilization(0.8f);

// Some devices rely on runtime fingerprint generation, so make sure
// we've defined it before booting further.
Build.ensureFingerprintProperty();

// Within the system server, it is an error to access Environment paths without
// explicitly specifying a user.
Environment.setUserRequired(true);

// Ensure binder calls into the system always run at foreground priority.
BinderInternal.disableBackgroundScheduling(true);

// Prepare the main looper thread (this thread).
android.os.Process.setThreadPriority(
android.os.Process.THREAD_PRIORITY_FOREGROUND);
android.os.Process.setCanSelfBackground(false);
Looper.prepareMainLooper();

// Initialize native services.
System.loadLibrary("android_servers");

// Check whether we failed to shut down last time we tried.
// This call may not return.
performPendingShutdown();

// Initialize the system context.
createSystemContext();

// Create the system service manager.
mSystemServiceManager = new SystemServiceManager(mSystemContext);
LocalServices.addService(SystemServiceManager.class, mSystemServiceManager);

// Start services.
try {
startBootstrapServices();
startCoreServices();
startOtherServices();
} catch (Throwable ex) {
Slog.e("System", "******************************************");
Slog.e("System", "************ Failure starting system services", ex);
throw ex;
}

// For debug builds, log event loop stalls to dropbox for analysis.
if (StrictMode.conditionallyEnableDebugLogging()) {
Slog.i(TAG, "Enabled StrictMode for system server main thread.");
}

// Loop forever.
Looper.loop();
throw new RuntimeException("Main thread loop unexpectedly exited");
}

这个类做的事情很多了,我能理解的有:
1. 初始化Looper:Looper.prepareMainLooper();
2. 装载本地库:System.loadLibrary(“android_servers”);
3. 向系统注册SystemService进程:

        // Create the system service manager.
mSystemServiceManager = new SystemServiceManager(mSystemContext);
LocalServices.addService(SystemServiceManager.class, mSystemServiceManager);

SystemService管理系统中所有的Service,它必须其他Service注册之前被注册。
4.启动其他的系统服务:

            startBootstrapServices();
startCoreServices();
startOtherServices();

下面简要浏览下这几个方法启动的服务:
4-1 startBootstrapServices简要浏览下:

  private void startBootstrapServices() {
// Wait for installd to finish starting up so that it has a chance to
// create critical directories such as /data/user with the appropriate
// permissions. We need this to complete before we initialize other services.
Installer installer = mSystemServiceManager.startService(Installer.class);

// Activity manager runs the show.
mActivityManagerService = mSystemServiceManager.startService(
ActivityManagerService.Lifecycle.class).getService();
mActivityManagerService.setSystemServiceManager(mSystemServiceManager);
mActivityManagerService.setInstaller(installer);

// Power manager needs to be started early because other services need it.
// Native daemons may be watching for it to be registered so it must be ready
// to handle incoming binder calls immediately (including being able to verify
// the permissions for those calls).
mPowerManagerService = mSystemServiceManager.startService(PowerManagerService.class);

// Now that the power manager has been started, let the activity manager
// initialize power management features.
mActivityManagerService.initPowerManagement();

// Manages LEDs and display backlight so we need it to bring up the display.
mSystemServiceManager.startService(LightsService.class);
...

这里面的都是重量级服务
A.ActivityManagerService,Activity管理服务
B.创建并注册apk安装服务,并想ActivityManagerService注册。
C.注册电源管理服务
D.LED服务
E.PowerManagerService,包管理服务

4-2 startCoreServices:

    private void startCoreServices() {
// Tracks the battery level. Requires LightService.
mSystemServiceManager.startService(BatteryService.class);

// Tracks application usage stats.
mSystemServiceManager.startService(UsageStatsService.class);
mActivityManagerService.setUsageStatsManager(
LocalServices.getService(UsageStatsManagerInternal.class));
// Update after UsageStatsService is available, needed before performBootDexOpt.
mPackageManagerService.getUsageStatsIfNoPackageUsageInfo();

// Tracks whether the updatable WebView is in a ready state and watches for update installs.
mSystemServiceManager.startService(WebViewUpdateService.class);
}

核心服务并不多:
A.电池管理服务
B.跟踪应用程序使用状态的服务
C.WebViewUpdateService

4-3 startOtherServices

       try {
Slog.i(TAG, "Reading configuration...");
SystemConfig.getInstance();

Slog.i(TAG, "Scheduling Policy");
ServiceManager.addService("scheduling_policy", new SchedulingPolicyService());

mSystemServiceManager.startService(TelecomLoaderService.class);

Slog.i(TAG, "Telephony Registry");
telephonyRegistry = new TelephonyRegistry(context);
ServiceManager.addService("telephony.registry", telephonyRegistry);

Slog.i(TAG, "Entropy Mixer");
entropyMixer = new EntropyMixer(context);

mContentResolver = context.getContentResolver();

Slog.i(TAG, "Camera Service");
mSystemServiceManager.startService(CameraService.class);

// The AccountManager must come before the ContentService
try {
// TODO: seems like this should be disable-able, but req'd by ContentService
Slog.i(TAG, "Account Manager");
accountManager = new AccountManagerService(context);
ServiceManager.addService(Context.ACCOUNT_SERVICE, accountManager);
} catch (Throwable e) {
Slog.e(TAG, "Failure starting Account Manager", e);
}

这个方法中启动的服务非常的多,蓝牙服务,Wifi服务,SystemUI服务等都在这里启动。请读者自行了解。

启动launcher和发送开机完成广播

这个方法中有个很重要的点需要了解:那就是启动Luncher与发送开机广播始于此。

        // We now tell the activity manager it is okay to run third party
// code. It will call back into us once it has gotten to the state
// where third party code can really run (but before it has actually
// started launching the initial applications), for us to complete our
// initialization.
mActivityManagerService.systemReady(new Runnable() {
@Override
public void run() {
Slog.i(TAG, "Making services ready");
mSystemServiceManager.startBootPhase(
SystemService.PHASE_ACTIVITY_MANAGER_READY);

try {
mActivityManagerService.startObservingNativeCrashes();
} catch (Throwable e) {
reportWtf("observing native crashes", e);
}

Slog.i(TAG, "WebViewFactory preparation");
WebViewFactory.prepareWebViewInSystemServer();

try {
startSystemUi(context);
} catch (Throwable e) {
reportWtf("starting System UI", e);
}
...

这里,当所有系统准备好了以后,就会调用ActivityManagerService.systemReady方法,这个方法当如下:

   public void systemReady(final Runnable goingCallback) {
synchronized(this) {
if (mSystemReady) {
// If we're done calling all the receivers, run the next "boot phase" passed in
// by the SystemServer
if (goingCallback != null) {
goingCallback.run();
}
return;
}

mLocalDeviceIdleController
= LocalServices.getService(DeviceIdleController.LocalService.class);

...
// Start up initial activity.
mBooting = true;
startHomeActivityLocked(mCurrentUserId, "systemReady");

...

这个方法非常长,但是重点的是它会调用startHomeActivityLocked方法,这个方法最终会启动Launcher。至于开机完成广播的发送过程,由于code比较复杂,这里就不展开了。

做完这些事情以后,SystemServer就进入了Looper.loop()中了。有消息来就执行,没有消息来就睡眠。